Add files via upload (#24)

Added State Department FOIA URLs
The 458,130 URLs in USAStateFOIA_pdf_urls_part1.txt and USAStateFOIA_pdf_urls_part2.txt are derived from the US State Department's Freedom of Information Act (FOIA) Virtual Reading Room search database. The website exposes its collections as 2,940 search pages and 455,190 endpoint PDFs.

2,940 Search Pages
Examples
http://foia.state.gov/Search/Results.aspx?collection=Clinton_Email_February_29_Release
http://foia.state.gov/Search/Results.aspx?collection=Litigation_F-2016-07895_6
https://foia.state.gov/Search/Results.aspx?caseNumber=F-1991-05139
https://foia.state.gov/Search/Results.aspx?IRIA.aspx
https://foia.state.gov/Search/Results.aspx?Microfiche.aspx
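
The search-page examples above use three query shapes: `collection=...`, `caseNumber=...`, and a bare token such as `IRIA.aspx`. A minimal sketch of how they might be told apart follows. The function name `describe_search_url` and the "bare" label are hypothetical and not part of this commit; only the example URLs are taken from the list.

```python
from urllib.parse import urlsplit, parse_qsl


def describe_search_url(url):
    """Return a (kind, value) pair for a search-page URL (hypothetical helper)."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps tokens that have no "=value" part.
    pairs = parse_qsl(query, keep_blank_values=True)
    if not pairs:
        return ("unknown", "")
    key, value = pairs[0]
    if value == "":
        # Bare query such as "?IRIA.aspx": report the token itself.
        return ("bare", key)
    return (key, value)


if __name__ == "__main__":
    examples = [
        "http://foia.state.gov/Search/Results.aspx?collection=Litigation_F-2016-07895_6",
        "https://foia.state.gov/Search/Results.aspx?caseNumber=F-1991-05139",
        "https://foia.state.gov/Search/Results.aspx?Microfiche.aspx",
    ]
    for example in examples:
        print(describe_search_url(example))
```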


455,190 PDF URLs
Examples
https://foia.state.gov/DOCUMENTS/1-FY2012/F-2004-02207/DOC_0C17731327/C17731327.pdf
https://foia.state.gov/DOCUMENTS/FOIA_Micro_Aug2024_6/F-1986-01832/DOC_0C09000001/C09000001.pdf
https://foia.state.gov/DOCUMENTS/FOIA_Micro_Oct2024_7/F-1989-00718/DOC_0C09000006/C09000006.pdf
https://foia.state.gov/DOCUMENTS/Litigation/HRCLitigation_1/JW7 RD4 02-24-2014 - 1 of 7.sdhdhpdf_Part1.pdf
https://foia.state.gov/DOCUMENTS\Argentina\0000AFA2.pdf
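
Some of the listed PDF URLs contain spaces or backslashes, so a consumer of the files may want to normalize them before fetching. The sketch below is one possible approach, not something this commit ships. The helper names `classify` and `normalize_pdf_url` are hypothetical, and it is an assumption that the server accepts forward slashes and percent-encoded spaces in place of the raw characters.

```python
from urllib.parse import urlsplit, urlunsplit, quote


def classify(url):
    """Label a URL as 'pdf', 'search', or 'other' based on its path."""
    path = urlsplit(url.replace("\\", "/")).path.lower()
    if path.endswith(".pdf"):
        return "pdf"
    if path.endswith("/search/results.aspx"):
        return "search"
    return "other"


def normalize_pdf_url(url):
    """Convert backslashes to slashes and percent-encode unsafe path characters."""
    parts = urlsplit(url.strip().replace("\\", "/"))
    # Keep "/" and existing "%" escapes; encode spaces and other unsafe characters.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    raw = r"https://foia.state.gov/DOCUMENTS\Argentina\0000AFA2.pdf"
    print(classify(raw), normalize_pdf_url(raw))
```

The original lines in the text files are left untouched by this approach; normalization happens only at fetch time.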
Antoine McGrath 2025-01-02 11:22:58 -06:00 committed by GitHub
parent 3e95bc46a9
commit 9c4b1910fd
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 458132 additions and 0 deletions

File diffs suppressed because they are too large