March 2019
This is a wonderful R package, thank you!
One question: can anyone tell me precisely what the “space” column value of TRUE or FALSE means when using the pdf_data function? I haven’t been able to locate any information on this by searching for poppler, pdftools, etc. …
March 2019
Actually I’m not entirely sure. From google I found:
hasSpaceAfter() will tell you the end of line when returning False.
So that may be it.
March 2019
▶ jeroenooms
Thank you. I thought that this was the case (namely for a set of common “y” coordinate-valued rows forming a line, the maximum x value (rightmost word) would have space == FALSE). But I do get exceptions where common y-values have more than one FALSE value for “space”. Which leads me to think that the y-coordinate value cannot be thought of as a “line” strictly – or the “space” logical value signifies something more subtle?
I’ll search for “hasSpaceAfter” to find more information, thank you.
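For anyone who wants to check this on their own pages, here is a minimal sketch of rebuilding lines from pdf_data()-style output. It assumes that space == TRUE means the word is followed by a space and that words sharing a y value belong to the same row. The data frame and the rebuild_line() helper are made up for illustration and are not part of pdftools. The second row deliberately contains two FALSE values so the behaviour described above is visible.

```r
# Hypothetical data shaped like one element of pdf_data(): one row per word.
words <- data.frame(
  x     = c(10, 40, 80, 10, 50, 200, 240),
  y     = c(100, 100, 100, 120, 120, 120, 120),
  space = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  text  = c("Total", "revenue", "2017", "Unit", "code", "Amount", "(USD)"),
  stringsAsFactors = FALSE
)

# Order words top-to-bottom, then left-to-right.
words <- words[order(words$y, words$x), ]

# Hypothetical helper: join the words of one row. A word with space == TRUE
# is followed by a single space; a word with space == FALSE that is not the
# last word in the row is followed by `gap`, so the break stays visible.
rebuild_line <- function(text, space, gap = " | ") {
  sep <- ifelse(space, " ", gap)
  sep[length(sep)] <- ""  # nothing after the last word
  paste0(text, sep, collapse = "")
}

# One string per distinct y value.
lines <- vapply(
  split(words, words$y),
  function(row) rebuild_line(row$text, row$space),
  character(1)
)

print(unname(lines))
# [1] "Total revenue 2017"       "Unit code | Amount (USD)"
```

Rows that come out with a " | " in the middle are the ones with more than one FALSE for a single y value, which makes them easy to pick out and compare against the PDF.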
June 2019
Hi, any tips on how to transform the pdf_data() output into the original “table-like” structure?
July 2020
Error with pdf_data()
item_dt <- pdf_data(pdf)[[7]]
Error in normalizePath(pdf, mustWork = TRUE) :
path[1]=" Federal, State, and Local Governments
2017 State and Local Government Finances
Technical Documentation
Individual Unit Data File (Public Use Format)
This is an ASCII fixed length text file. It contains amount for each finance item code within each
government unit for all respondents and non-respondents in the sample. This large file can be useful
for programming and database applications.
For 2017, the file name is 2017FinEstDAT_02202020modp_pu.txt and contains a standard 34-
character public-use format record layout. It is about 59 megabytes. Below is a detailed record
layout for the file.
This happens with every page. What does it mean? Thanks!
July 2020
▶ Eric
Sorry for the delay. Can you be more specific? What do you mean by the original table-like structure? Can you give an example?
July 2020
▶ erica-grabowski
pdf_data expects a file path or raw vector. From the error, it looks like your variable pdf probably holds the document’s text rather than a path to the file, correct? Try passing the file path instead.
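A minimal sketch of that fix, in case it helps. The path and the looks_like_path() helper are placeholders made up for illustration; only pdf_data() comes from pdftools. The idea is to check that the variable really points at a file before calling pdf_data() on it.

```r
# Hypothetical path; replace with the location of your own PDF.
pdf_path <- "path/to/document.pdf"

# Hypothetical helper: TRUE only for a single string naming an existing file.
# Extracted page text fails this check, since it is not a file on disk.
looks_like_path <- function(x) {
  is.character(x) && length(x) == 1 && file.exists(x)
}

if (requireNamespace("pdftools", quietly = TRUE) && looks_like_path(pdf_path)) {
  # pdf_data() returns a list with one data frame per page.
  pages <- pdftools::pdf_data(pdf_path)
  if (length(pages) >= 7) {
    item_dt <- pages[[7]]  # page 7, as in the post above
  }
} else {
  message("pdftools is not installed, or this is not a file path: ", pdf_path)
}
```

If the check fails for your variable, printing the first few characters with substr(pdf, 1, 80) should show whether it holds page text instead of a path.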
January 2021
January 2021
▶ lizlaw
Hi @lizlaw!
(rOpenSci Community Assistant here) Cool use of the pdftools package by Jeroen Ooms!
Would you consider adding this use case (description and code snippet or link to code/post) to the use case forum?
discuss.ropensci.org/c/usecases/
There’s a template to help & we tweet to help share applications of rOpenSci pkgs!
January 2021
▶ steffilazerte
Done - thanks for the suggestion!
May 2021
December 2021
This is not an issue or a suggestion. I just wanted to say this package has saved me hours of work. Thank you for all the effort. It really makes a difference.
December 2021
@SivuyileNzimeni Thank you so much for taking the time to share your appreciation!