Extracting data from documents with Python is not only fun but also saves a ton of time. Python provides tools for automating such repetitive tasks, as well as many libraries that let us interact with documents programmatically. I have multiple scripts that do just that: extract data from hundreds of documents, clean the data, and present it in a more useful format. All of this can be automated and done with the click of a button. The alternative would be spending hours scanning through documents manually.

Over time, things change. The data we need changes, the structure of the documents we use changes, the goals change. This may require revisiting and updating scripts, which becomes a bit more challenging if it has been a while since we wrote them. This was the case for me again this week. I had a project to revisit some data extraction scripts because the structure of the documents they use has changed over time. While everything worked as expected, tweaking the data extraction and processing could improve the output.

Python has many libraries that deal with PDF documents. Pdfplumber is my favorite one and I have used it many times. One feature I hadn't experimented with yet was **Visual Debugging**. It is a very simple process, and using it saves a lot of time when writing the actual data extraction code for these documents.

Sometimes when you extract data from PDFs, the results don't match what you see on the page. For example, tables might look scrambled or text could be out of order. Visual debugging with pdfplumber lets you see how your code interprets the document so you can fix mistakes quickly.

If you don't have pdfplumber installed yet, make sure to `pip install pdfplumber` first. Extracting text from PDF documents takes only a few lines of code, as shown below.

```
import pdfplumber

with pdfplumber.open("example.pdf") as pdf:
    first_page = pdf.pages[0]
    print(first_page.extract_text())
```

The code above gets all text on the page.
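To go beyond the first page, something along these lines could collect the text of every page in a document. This is only a sketch: the helper names `combine_page_texts` and `extract_all_text` are made up for illustration, and skipping pages that yield no text is my assumption about what is wanted here.

```python
def combine_page_texts(page_texts, separator="\n\n"):
    """Join per-page text, skipping pages that produced no text."""
    return separator.join(
        text.strip() for text in page_texts if text and text.strip()
    )


def extract_all_text(pdf_path):
    """Return the text of every page in the PDF as one string (hypothetical helper)."""
    # Imported here so combine_page_texts can be used without pdfplumber installed
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # The generator is consumed inside the with block, while the file is still open
        return combine_page_texts(page.extract_text() for page in pdf.pages)
```

Calling `extract_all_text("example.pdf")` would then return the text of all pages, separated by blank lines.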
However, we may want to get text only in specific locations on the page. For this we can use the **.crop(bounding_box, relative=False, strict=True)** method. Calling it on the page we are working on returns a version of the page that includes only the items within the bounding box we provide, given as (x0, top, x1, bottom) coordinates. I just create a helper function like the one below to crop the areas I need. All we need to do is figure out our bounding box coordinates.

```
def get_rect_text(page, bounding_box):
    # bounding_box is an (x0, top, x1, bottom) tuple in page coordinates
    cropped = page.crop(bounding_box)
    # Fall back to an empty string if no text was found in the cropped area
    text = (cropped.extract_text() or '').split('\n')
    return text
```

We can guess approximately where x0, top, x1, and bottom are and play with the numbers until we get what we need. But this can introduce errors later, and trying different numbers is also a very boring process. Alternatively, we can use the visual debugging features pdfplumber provides to see where things are. The simplest way is to draw horizontal and vertical lines, kind of creating a grid, which makes figuring out these numbers super simple. Plugging in these numbers, we can crop any area we need and repeat the same process for all the pages and documents as needed.

```
import pdfplumber

def pdf_draw_lines(filename):
    with pdfplumber.open(filename) as pdf:
        for count, page in enumerate(pdf.pages, start=1):
            page_img = page.to_image(resolution=250)
            # Vertical guide lines at fixed x positions, from the top of the page down to y=800
            page_img.draw_line(((60, 0), (60, 800)), stroke='red', stroke_width=1)
            page_img.draw_line(((63, 0), (63, 800)), stroke='blue', stroke_width=1)
            page_img.draw_line(((110, 0), (110, 800)), stroke='red', stroke_width=1)
            page_img.draw_line(((113, 0), (113, 800)), stroke='blue', stroke_width=1)
            # Save each annotated page as its own PNG
            page_img.save(f'/location/doc{count}.png', format="PNG", quantize=True, colors=256, bits=8)
```

Above you can see a small function that draws lines on each page of the document and saves the pages locally. We can examine these images to get a better understanding of the structure of the document and plan how we will extract and use the data.
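Instead of hard-coding each line, a full grid could be generated from the page size. Treat this as a rough sketch rather than the way to do it: `grid_positions` and `save_grid_image` are hypothetical names, the 50-point step and the 150 resolution are arbitrary choices, and it assumes the `draw_vlines` and `draw_hlines` helpers on pdfplumber's page image are available in your installed version.

```python
def grid_positions(length, step):
    """Return positions 0, step, 2*step, ... that are strictly below `length`."""
    if step <= 0:
        raise ValueError("step must be positive")
    positions = []
    pos = 0
    while pos < length:
        positions.append(pos)
        pos += step
    return positions


def save_grid_image(pdf_path, out_path, page_number=0, step=50):
    """Save one page as a PNG with a grid drawn every `step` points (hypothetical helper)."""
    # Imported here so grid_positions can be used without pdfplumber installed
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_number]
        img = page.to_image(resolution=150)
        # Vertical lines mark x coordinates, horizontal lines mark top/bottom coordinates
        img.draw_vlines(grid_positions(page.width, step), stroke="red", stroke_width=1)
        img.draw_hlines(grid_positions(page.height, step), stroke="blue", stroke_width=1)
        img.save(out_path, format="PNG")
```

With a known step, counting grid lines on the saved image gives a reasonable first estimate for a bounding box, which can then be refined with a smaller step.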
Drawing horizontal and vertical lines is the simplest way for us to visually debug the documents. pdfplumber provides much more interesting and powerful ways of accomplishing these tasks. Feel free to visit the [pdfplumber documentation](https://github.com/jsvine/pdfplumber) for more details.

This didn't work for me right away. I initially got errors complaining that I didn't have the image-related dependencies on my machine. Fixing this took more than a single pip install. The error suggested what to install, and the installation took a while to complete. In the end everything worked, except for the **.show()** method. I didn't need it, since I could just save the images and view them afterwards.

Pdfplumber works great with other Python libraries, like pandas, for handling data. For example, if you extract a table from a PDF, you can turn it into a pandas DataFrame to clean or analyze the data more easily. Debugging with pdfplumber helps ensure the data is clean before you move to the next steps.

Pdfplumber is a simple yet powerful tool for working with PDFs. It's especially useful for beginners because it gives you visual feedback, making it easier to see what's happening and fix issues. Whether you're working with text, tables, or images, pdfplumber helps make the process smoother and more reliable.
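The pandas hand-off mentioned above could look roughly like the sketch below. The names `table_to_records` and `first_table_as_dataframe` are hypothetical, and it assumes the first row of the extracted table is the header, which will not be true for every document.

```python
def table_to_records(table):
    """Turn rows from page.extract_table() into dicts keyed by the header row."""
    if not table:
        # extract_table() returns None when no table is found on the page
        return []
    # Cells can be None, so replace those with empty strings before stripping
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cleaned = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, cleaned)))
    return records


def first_table_as_dataframe(pdf_path, page_number=0):
    """Load the first table found on a page into a DataFrame (hypothetical helper)."""
    # Imported here so table_to_records can be used without these libraries installed
    import pandas as pd
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        table = pdf.pages[page_number].extract_table()
    return pd.DataFrame(table_to_records(table))
```

Keeping the cleanup in a plain function makes it easy to check against a few hand-written rows before pointing it at real PDFs.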
author | geekgirl |
---|---|
permlink | visual-debugging-pdf-documents-with-pdfplumber |
category | python |
json_metadata | {"tags":["python","programming","pdfplumber","coding","proofofbrain"],"image":["https://images.hive.blog/DQmNojFSEbZhEpxtYXh4PD9HmUTfq3ro52SJTerLc5RSSec/pdfplumber.png"],"links":["https://github.com/jsvine/pdfplumber"],"app":"hiveblog/0.1","format":"markdown"} |
created | 2024-12-12 19:38:21 |
last_update | 2024-12-12 19:38:21 |
depth | 0 |
children | 20 |
last_payout | 2024-12-19 19:38:21 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 23.445 HBD |
curator_payout_value | 23.407 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 5,526 |
author_reputation | 1,586,488,611,824,452 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,162,215 |
net_rshares | 127,851,495,283,677 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
steempty | 0 | 19,810,800,529,236 | 60% | ||
tuck-fheman | 0 | 5,315,061,843 | 100% | ||
leprechaun | 0 | 3,236,602,933 | 19.5% | ||
deanliu | 0 | 6,111,382,802,394 | 100% | ||
edouard | 0 | 36,309,800,257 | 15% | ||
alexpmorris | 0 | 378,629,728,932 | 100% | ||
shaka | 0 | 2,035,714,668,771 | 60% | ||
magicmonk | 0 | 7,516,176,543,399 | 100% | ||
oflyhigh | 0 | 3,947,946,624,176 | 100% | ||
hanshotfirst | 0 | 158,497,226,209 | 100% | ||
borran | 0 | 1,212,438,743,158 | 100% | ||
lemouth | 0 | 591,945,483,495 | 20% | ||
gduran | 0 | 24,651,477,425 | 100% | ||
ats-david | 0 | 3,110,146,094 | 50% | ||
jlufer | 0 | 21,722,594,618 | 100% | ||
macksby | 0 | 13,348,578,334 | 100% | ||
daveks | 0 | 2,275,502,593,889 | 85% | ||
penguinpablo | 0 | 215,313,325,481 | 14% | ||
uwelang | 0 | 808,491,791,790 | 30% | ||
cornerstone | 0 | 543,243,627,968 | 20% | ||
funnyman | 0 | 1,438,292,144 | 5.6% | ||
catotune | 0 | 9,491,586,749 | 20% | ||
clayboyn | 0 | 17,998,641,418 | 25% | ||
lydon.sipe | 0 | 5,844,096,236 | 100% | ||
techslut | 0 | 335,333,764,018 | 50% | ||
supergoodliving | 0 | 581,431,922 | 50% | ||
v4vapid | 0 | 4,427,173,336,686 | 33% | ||
darth-azrael | 0 | 31,688,432,302 | 10% | ||
michellectv | 0 | 253,127,194,580 | 100% | ||
darth-cryptic | 0 | 5,803,965,321 | 10% | ||
diggndeeper.com | 0 | 7,227,404,998,860 | 100% | ||
ganjafarmer | 0 | 118,468,523,585 | 60% | ||
egonz | 0 | 23,765,718,095 | 92% | ||
gamersclassified | 0 | 169,307,305,801 | 100% | ||
domo | 0 | 456,369,792 | 100% | ||
ampm | 0 | 10,185,901,753 | 100% | ||
newsflash | 0 | 191,678,519,245 | 8.25% | ||
xels | 0 | 313,296,284,563 | 50% | ||
isaria | 0 | 30,150,583,198 | 50% | ||
bitcoinflood | 0 | 1,721,818,334,489 | 50% | ||
alphacore | 0 | 7,402,570,121 | 7.12% | ||
enjar | 0 | 1,054,095,926,818 | 100% | ||
spectrumecons | 0 | 2,041,109,255,702 | 30% | ||
joeyarnoldvn | 0 | 531,109,109 | 1.68% | ||
vikbuddy | 0 | 38,484,660,389 | 37% | ||
papilloncharity | 0 | 3,225,763,570,045 | 85.5% | ||
sanjeevm | 0 | 2,477,842,474,786 | 50% | ||
leaky20 | 0 | 489,987,274,347 | 75% | ||
vikisecrets | 0 | 861,752,983,333 | 33% | ||
sciencevienna | 0 | 109,224,786,774 | 100% | ||
resiliencia | 0 | 3,150,449,297,537 | 100% | ||
powpow420 | 0 | 863,195,022 | 30% | ||
mochita | 0 | 43,545,082,050 | 100% | ||
faustofraser | 0 | 3,904,120,256 | 63% | ||
shanibeer | 0 | 819,954,430,346 | 35% | ||
chops316 | 0 | 200,922,911,189 | 100% | ||
santigs | 0 | 328,056,549,639 | 50% | ||
zirochka | 0 | 578,783,657,494 | 60% | ||
stoodkev | 0 | 17,855,919,636,332 | 70% | ||
jedigeiss | 0 | 2,996,325,557,449 | 100% | ||
artonmysleeve | 0 | 7,778,767,272 | 42.5% | ||
roleerob | 0 | 6,764,162,156 | 100% | ||
fatman | 0 | 9,161,684,922 | 2% | ||
mawit07 | 0 | 83,793,267,891 | 50% | ||
revisesociology | 0 | 2,732,992,797,321 | 100% | ||
isnochys | 0 | 11,671,113,528 | 5% | ||
vegoutt-travel | 0 | 30,722,462,303 | 30% | ||
dandays | 0 | 720,776,736,976 | 50% | ||
stonergirls | 0 | 471,074,163 | 60% | ||
ybanezkim26 | 0 | 33,974,572,289 | 100% | ||
fknmayhem | 0 | 7,784,517,625 | 75% | ||
steemflow | 0 | 238,761,614,631 | 72% | ||
omstavan | 0 | 8,035,555,843 | 100% | ||
travoved | 0 | 64,303,253,653 | 100% | ||
hanzappedfirst | 0 | 6,554,232,915 | 100% | ||
elderson | 0 | 2,161,261,711 | 30% | ||
rezoanulvibes | 0 | 8,458,975,664 | 100% | ||
wiseagent | 0 | 11,379,929,583 | 15% | ||
cryptonized | 0 | 236,838,676 | 14% | ||
fourfourfun | 0 | 8,717,996,110 | 25% | ||
daltono | 0 | 158,097,230,602 | 33% | ||
gabrielatravels | 0 | 173,068,829,991 | 100% | ||
josevillanueva | 0 | 8,456,026,651 | 90% | ||
leslierevales | 0 | 5,984,750,564 | 42.75% | ||
soyrosa | 0 | 354,862,883,937 | 50% | ||
newageinv | 0 | 392,326,344,693 | 25% | ||
careassaktart | 0 | 30,845,590,664 | 60% | ||
penderis | 0 | 42,767,588,833 | 50% | ||
movement19 | 0 | 1,753,787,978 | 6.25% | ||
beeyou | 0 | 7,487,450,777 | 100% | ||
lisfabian | 0 | 5,257,148,506 | 100% | ||
videoaddiction | 0 | 43,877,660,819 | 35% | ||
backinblackdevil | 0 | 43,952,107,210 | 100% | ||
blockchainyouth | 0 | 18,627,517,682 | 25% | ||
ai1love | 0 | 83,505,757 | 100% | ||
shoemanchu | 0 | 7,606,431,717 | 100% | ||
louis88 | 0 | 470,458,330,190 | 20% | ||
mrchef111 | 0 | 74,225,485,063 | 100% | ||
z3ll | 0 | 2,479,739,441 | 100% | ||
tsurmb | 0 | 76,321,338,286 | 100% | ||
cedricguillas | 0 | 279,329,079,975 | 70% | ||
break-out-trader | 0 | 37,304,539,262 | 100% | ||
mraggaj | 0 | 149,596,443,804 | 100% | ||
solominer | 0 | 5,362,524,095,458 | 25% | ||
fw206 | 0 | 231,564,622,782 | 3% | ||
slobberchops | 0 | 4,041,661,580,155 | 60% | ||
pladozero | 0 | 31,998,783,305 | 10% | ||
nateaguila | 0 | 174,922,267,837 | 12% | ||
bluewall | 0 | 30,356,510,514 | 100% | ||
digital.mine | 0 | 104,233,251,456 | 100% | ||
greenunion | 0 | 903,085,880 | 60% | ||
czera | 0 | 619,982,829 | 100% | ||
mrnightmare89 | 0 | 6,386,814,180 | 20% | ||
smartvote | 0 | 111,337,930,767 | 4.59% | ||
vixmemon | 0 | 19,738,593,403 | 50% | ||
harkar | 0 | 213,686,508,901 | 20% | ||
idakarlsen | 0 | 43,583,510,154 | 10% | ||
teampdx | 0 | 467,734,274 | 60% | ||
teamoregon | 0 | 667,650,134 | 60% | ||
voter001 | 0 | 54,538,494,660 | 58.8% | ||
ganjafarmers | 0 | 532,928,692 | 60% | ||
stefano.massari | 0 | 108,797,937,728 | 48% | ||
kind.network | 0 | 2,264,176,635 | 60% | ||
thrasher666 | 0 | 2,569,090,012 | 60% | ||
vasigo | 0 | 21,397,901,969 | 100% | ||
inpursuit | 0 | 5,977,338,899 | 39.45% | ||
bro-poker | 0 | 687,679,245 | 50% | ||
gudnius.comics | 0 | 11,513,516,421 | 100% | ||
smokingfit | 0 | 791,066,935 | 50% | ||
nutfund | 0 | 4,342,177,784 | 100% | ||
starrouge | 0 | 1,001,238,721 | 50% | ||
retrodroid | 0 | 3,299,259,682 | 10% | ||
wherein | 0 | 24,083,091,202 | 100% | ||
bluerobo | 0 | 335,191,871,180 | 100% | ||
zerofive | 0 | 872,902,987 | 50% | ||
zydane | 0 | 42,744,447,862 | 100% | ||
jacuzzi | 0 | 575,061,999 | 1.4% | ||
blind-spot | 0 | 9,256,595,224 | 50% | ||
primeradue | 0 | 491,080,196 | 33% | ||
lestrange | 0 | 10,911,972,888 | 100% | ||
samantha-w | 0 | 2,641,538,367,332 | 60% | ||
cnstm | 0 | 114,636,611,113 | 100% | ||
likuang007 | 0 | 648,414,847 | 100% | ||
lianjingmedia | 0 | 968,296,403 | 100% | ||
creacioneslelys | 0 | 25,353,748,616 | 100% | ||
leeyh2 | 0 | 20,432,825,791 | 100% | ||
hungrybear | 0 | 595,617,338 | 14% | ||
pursuant | 0 | 5,967,404,082 | 38.93% | ||
doze | 0 | 1,119,215,117 | 50% | ||
steemmonsterking | 0 | 1,311,957,147 | 12.5% | ||
vaporrhino | 0 | 1,247,848,668 | 60% | ||
sophieandhenrik | 0 | 6,550,841,200 | 30% | ||
russia-btc | 0 | 131,430,185,535 | 30% | ||
wulff-media | 0 | 43,965,351,984 | 100% | ||
agmoore2 | 0 | 14,201,572,114 | 100% | ||
bearjohn | 0 | 1,859,439,501 | 75% | ||
mktmaker | 0 | 702,410,735 | 72.75% | ||
whangster79 | 0 | 6,368,255,187 | 25% | ||
bilpcoinbot1 | 0 | 0 | 100% | ||
urun | 0 | 25,441,983,801 | 100% | ||
brocfml | 0 | 1,633,733,450 | 100% | ||
therealyme | 0 | 1,116,944,290,653 | 15% | ||
tobago | 0 | 630,942,606 | 35% | ||
sidekicker2 | 0 | 537,665,626 | 13% | ||
stoodmonsters | 0 | 31,830,034,815 | 70% | ||
davidlionfish | 0 | 13,743,283,064 | 50% | ||
zeusflatsak | 0 | 9,388,072,663 | 60% | ||
shinoxl | 0 | 12,590,834,057 | 100% | ||
tht | 0 | 13,728,578,540 | 100% | ||
penned-bullshit | 0 | 459,109,122 | 50% | ||
captainhive | 0 | 754,121,326,127 | 30% | ||
tht1 | 0 | 3,002,261,512 | 100% | ||
ninnu | 0 | 568,769,391 | 50% | ||
hive-169313 | 0 | 2,265,175,180 | 100% | ||
logicforce | 0 | 2,858,602,445 | 50% | ||
plusvault | 0 | 875,414,264 | 25% | ||
koxmicart | 0 | 271,490,229 | 100% | ||
recoveryinc | 0 | 8,213,171,585 | 12.5% | ||
hive-data | 0 | 433,913,503 | 10% | ||
martial.media | 0 | 3,241,827,247 | 60% | ||
liz.writes | 0 | 616,035,740 | 30% | ||
dying | 0 | 915,006,179 | 25% | ||
gallatin | 0 | 29,555,926,198 | 100% | ||
momins | 0 | 5,427,400,463 | 100% | ||
samrisso | 0 | 8,840,740,761 | 12.5% | ||
kriszrokk | 0 | 37,563,053,752 | 100% | ||
yieldgrower | 0 | 12,663,851,315 | 100% | ||
haitch | 0 | 3,303,365,744 | 100% | ||
tomtothetom | 0 | 3,674,858,969 | 25% | ||
biglove | 0 | 1,247,617,454 | 25% | ||
wend1go | 0 | 16,780,137,181 | 100% | ||
drricksanchez | 0 | 42,978,849,699 | 7.5% | ||
zarnoex | 0 | 514,972,835 | 100% | ||
unlockmaster | 0 | 10,589,206,176 | 100% | ||
reidenling90 | 0 | 5,370,512,993 | 100% | ||
zwhammer | 0 | 1,167,882,207 | 50% | ||
yeouido.park | 0 | 2,168,537,533,549 | 100% | ||
farpetrad | 0 | 95,913,719,135 | 100% | ||
bulldog1205 | 0 | 727,311,788 | 25% | ||
cristanza42 | 0 | 5,864,765,929 | 100% | ||
borsengelaber | 0 | 132,859,917,988 | 50% | ||
celeste413 | 0 | 39,952,764,682 | 100% | ||
aloysiusmbaba | 0 | 1,524,875,055 | 100% | ||
olympicdragon | 0 | 893,778,476 | 100% | ||
alpha-omega | 0 | 156,013,805,307 | 100% | ||
torran | 0 | 19,689,874,822 | 80% | ||
aequi | 0 | 44,826,100,547 | 70% | ||
speko | 0 | 18,434,485,105 | 25% | ||
aurzeq | 0 | 75,386,619,099 | 100% | ||
acantoni | 0 | 3,227,688,237 | 12.5% | ||
davideownzall | 0 | 633,335,590 | 100% | ||
kungfukid | 0 | 189,255,210,942 | 100% | ||
njker | 0 | 600,705,503 | 25% | ||
rocket47 | 0 | 7,229,361,451 | 100% | ||
marsupia | 0 | 1,867,473,576 | 50% | ||
mimi.ruby | 0 | 143,400,187,840 | 70% | ||
iproto | 0 | 19,412,140,482 | 50% | ||
hylene74 | 0 | 72,264,785,497 | 100% | ||
aftersound | 0 | 297,364,619,654 | 100% | ||
gaposchkin | 0 | 10,047,514,992 | 100% | ||
almajandra | 0 | 12,673,518,689 | 100% | ||
princekeys | 0 | 5,539,040,685 | 100% | ||
ssebasv | 0 | 1,932,593,795 | 100% | ||
willkomo | 0 | 749,797,099 | 100% | ||
herman-german | 0 | 5,217,003,098 | 50% | ||
mfontom | 0 | 6,940,905,028 | 100% | ||
hoffmeister84 | 0 | 10,710,950,729 | 70% | ||
condigital | 0 | 0 | 13% | ||
templar.pool | 0 | 54,553,252,040 | 100% | ||
woodathegsd | 0 | 6,959,972,725 | 60% | ||
zerofucks | 0 | 4,520,403,721 | 100% | ||
santacruz.sports | 0 | 2,262,076,619 | 100% | ||
zuun.net | 0 | 596,215,224 | 13% | ||
theindiankid | 0 | 25,224,338,904 | 100% | ||
yelimarin | 0 | 30,456,218,256 | 100% | ||
littlebee4 | 0 | 50,668,410,495 | 13% | ||
islandboi | 0 | 833,970,199 | 100% | ||
janitzearratia | 0 | 32,046,210,724 | 50% | ||
surrealis | 0 | 60,401,536,579 | 100% | ||
janetedita | 0 | 133,640,552,434 | 100% | ||
katherine-w | 0 | 2,203,760,981,610 | 60% | ||
h4rr1s | 0 | 5,904,357,653 | 100% | ||
boeltermc | 0 | 7,596,538,068 | 60% | ||
sydechan | 0 | 1,126,900,070,427 | 100% | ||
cimmeron | 0 | 3,016,436,625 | 12.5% | ||
caelum1infernum | 0 | 1,866,581,897 | 12% | ||
odessamama | 0 | 2,910,769,959 | 60% | ||
franzpaulie | 0 | 44,479,998,611 | 100% | ||
panosdada.tip | 0 | 1,467,139,046 | 100% | ||
wolfplayzor | 0 | 2,132,653,790 | 100% | ||
positivum | 0 | 791,182,693 | 100% | ||
bitterirony | 0 | 34,351,888,715 | 100% | ||
llunasoul | 0 | 621,910,702 | 1.11% | ||
growandbow | 0 | 12,794,008,934 | 1.11% | ||
tzae | 0 | 715,009,751 | 100% | ||
the13anarchist | 0 | 1,040,193,053 | 8.25% | ||
vickvan | 0 | 515,007,050 | 30% | ||
revise.spk | 0 | 763,638,099 | 100% | ||
labyrinths | 0 | 12,760,019,377 | 100% | ||
queercoin | 0 | 83,505,986,897 | 50% | ||
pocket-rents | 0 | 1,051,221,204 | 25% | ||
psyskiff | 0 | 384,112,344 | 20% | ||
hive-195880 | 0 | 1,036,014,266 | 25% | ||
eosdios | 0 | 397,231,600 | 20% | ||
ibbtammy | 0 | 160,375,281,771 | 100% | ||
ganjafrmer | 0 | 1,180,044,943 | 60% | ||
liquidocelotytt | 0 | 466,167,541 | 100% | ||
orkangel | 0 | 673,017,485 | 15% | ||
timix648 | 0 | 527,433,124 | 26.4% | ||
strega.azure | 0 | 77,012,011,059 | 100% | ||
literal | 0 | 87,434,069,693 | 100% | ||
chris-chris92 | 0 | 7,466,471,140 | 100% | ||
steve.and.anke | 0 | 12,079,593,535 | 100% | ||
young-tari | 0 | 14,276,409,170 | 100% | ||
ifhy | 0 | 522,345,677 | 17.5% | ||
hivecuba.p2p | 0 | 56,945,150,120 | 100% | ||
tamiem | 0 | 11,981,997,285 | 100% | ||
neoxianvoter | 0 | 0 | 13% | ||
tecnotronics | 0 | 13,409,624,307 | 100% | ||
picazzy005 | 0 | 1,037,264,195 | 35% | ||
lilkista | 0 | 6,690,307,090 | 100% | ||
mama21 | 0 | 5,699,494,695 | 100% | ||
lilkistb | 0 | 5,076,756,984 | 100% | ||
bijoykhan2005 | 0 | 380,842,106,656 | 100% | ||
cur01 | 0 | 8,702,387,864 | 100% | ||
adrian37 | 0 | 1,738,285,307 | 100% | ||
thesegunvictor | 0 | 2,378,648,639 | 100% |
If PDFplumber saves time and produces more effective results when extracting data from PDFs then it's the way to go. Automating repetitive tasks sounds like a fine idea. Maybe I'll try PDFplumber when I have such PDF work to do. Thanks for this useful info. Have a great day.
author | aloysiusmbaba |
---|---|
permlink | re-geekgirl-20241212t215836515z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-12 20:58:39 |
last_update | 2024-12-12 20:58:39 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:58:39 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.082 HBD |
curator_payout_value | 0.083 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 275 |
author_reputation | 26,788,448,453,917 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,163,648 |
net_rshares | 457,989,287,164 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 457,989,287,164 | 10% |
Wow I never knew PDFs could be edited. I've tried it but it didn't work for me
author | bisolamih |
---|---|
permlink | re-geekgirl-20241213t17543149z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.1.0-mobile","format":"markdown+html"} |
created | 2024-12-13 16:05:42 |
last_update | 2024-12-13 16:05:42 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 16:05:42 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.076 HBD |
curator_payout_value | 0.076 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 75 |
author_reputation | 72,098,015,457,247 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,181,550 |
net_rshares | 451,663,872,862 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 451,663,872,862 | 10% |
Interesting. I will have to create a script to extract data from videos. Kinda pushing it off :(
author | bluerobo |
---|---|
permlink | re-geekgirl-soegeh |
category | python |
json_metadata | {"tags":["python"],"app":"peakd/2024.11.3","image":[],"users":[]} |
created | 2024-12-12 21:15:54 |
last_update | 2024-12-12 21:15:54 |
depth | 1 |
children | 4 |
last_payout | 2024-12-19 21:15:54 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.082 HBD |
curator_payout_value | 0.082 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 96 |
author_reputation | 100,998,498,432,992 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,164,199 |
net_rshares | 457,076,712,938 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 457,076,712,938 | 10% |
<div class='pull-right'>https://files.peakd.com/file/peakd-hive/beerlover/yiuU6bdf-beerlover20gives20BEER.gif<p><sup><a href='https://hive-engine.com/?p=market&t=BEER'>View or trade </a> <code>BEER</code>.</sup></p></div><center><br> <p>Hey @bluerobo, here is a little bit of <code>BEER</code> from @isnochys for you. Enjoy it!</p> <p>Learn how to <a href='https://peakd.com/beer/@beerlover/what-is-proof-of-stake-with-beer'>earn <b>FREE BEER</b> each day </a> by staking your <code>BEER</code>.</p> </center><div></div>
author | beerlover |
---|---|
permlink | re-bluerobo-re-geekgirl-soegeh-20241212t220348880z |
category | python |
json_metadata | {"app":"beerlover/3.0","language":"rust","developer":"wehmoen"} |
created | 2024-12-12 22:03:48 |
last_update | 2024-12-12 22:03:48 |
depth | 2 |
children | 0 |
last_payout | 2024-12-19 22:03:48 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 521 |
author_reputation | 25,761,508,188,824 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,164,991 |
net_rshares | 0 |
<div class='pull-right'>https://files.peakd.com/file/peakd-hive/beerlover/yiuU6bdf-beerlover20gives20BEER.gif<p><sup><a href='https://hive-engine.com/?p=market&t=BEER'>View or trade </a> <code>BEER</code>.</sup></p></div><center><br> <p>Hey @bluerobo, here is a little bit of <code>BEER</code> from @isnochys for you. Enjoy it!</p> <p>Do you want to <a href='https://friends.beersaturday.com/'>win <b>SOME BEER</b> together with your friends </a> and draw the <code>BEERKING</code>.</p> </center><div></div>
author | beerlover |
---|---|
permlink | re-bluerobo-re-geekgirl-soegeh-20241213t221159741z |
category | python |
json_metadata | {"app":"beerlover/3.0","language":"rust","developer":"wehmoen"} |
created | 2024-12-13 22:12:00 |
last_update | 2024-12-13 22:12:00 |
depth | 2 |
children | 0 |
last_payout | 2024-12-20 22:12:00 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 508 |
author_reputation | 25,761,508,188,824 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,190,805 |
net_rshares | 0 |
Thank you for your [witness vote](https://hivesigner.com/sign/account-witness-vote?witness=isnochys&approve=1)! Have a !BEER on me! To Opt-Out of my witness beer program just comment STOP below
author | isnochys |
---|---|
permlink | re-re-geekgirl-soegeh-20241212t220336z |
category | python |
json_metadata | "{"app": "beem/0.24.26"}" |
created | 2024-12-12 22:03:39 |
last_update | 2024-12-12 22:03:39 |
depth | 2 |
children | 0 |
last_payout | 2024-12-19 22:03:39 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 194 |
author_reputation | 48,490,072,901,013 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,164,988 |
net_rshares | -5,242,189,753 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
spaminator | 0 | -5,242,189,753 | -0.25% |
Thank you for your [witness vote](https://hivesigner.com/sign/account-witness-vote?witness=isnochys&approve=1)! Have a !BEER on me! To Opt-Out of my witness beer program just comment STOP below
author | isnochys |
---|---|
permlink | re-re-geekgirl-soegeh-20241213t221144z |
category | python |
json_metadata | "{"app": "beem/0.24.26"}" |
created | 2024-12-13 22:11:45 |
last_update | 2024-12-13 22:11:45 |
depth | 2 |
children | 0 |
last_payout | 2024-12-20 22:11:45 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 194 |
author_reputation | 48,490,072,901,013 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,190,797 |
net_rshares | -5,298,451,061 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
spaminator | 0 | -5,298,451,061 | -0.25% |
I only do Java, but Python seems a very dynamic and pretty modern language. Even Stable Diffusion runs on Python, and you show this, which is totally different... very adaptive
author | davideownzall |
---|---|
permlink | re-geekgirl-soeebw |
category | python |
json_metadata | {"tags":["python"],"app":"peakd/2024.11.3","image":[],"users":[]} |
created | 2024-12-12 20:31:09 |
last_update | 2024-12-12 20:31:09 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:31:09 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.080 HBD |
curator_payout_value | 0.080 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 171 |
author_reputation | 98,177,933,727,290 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,163,117 |
net_rshares | 446,290,868,950 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 450,763,949,465 | 10% | ||
spaminator | 0 | -5,217,519,290 | -0.25% | ||
endhivewatchers | 0 | 744,438,775 | 5% |
Ever tried switching formats and felt like decoding ancient scrolls? CoolUtils makes it feel like flipping pancakes. Their Online PDF Converter processes over 1,400 file types with ease (DOC, XLS, HTML, TIFF, and more) right from your browser, no installs. It's fast, free for smaller tasks, and handles batch conversions like a pro. Upload from your desktop, tweak settings, and get instant results. The tool preserves layout, fonts, and images flawlessly https://www.coolutils.com/online/PDF-Converter/ whether you're prepping a report or archiving old contracts. Even better, the site supports cloud integrations like Google Drive and Dropbox, making file transfers smooth. When efficiency matters and simplicity counts, this tool delivers. Every time. Trust it, it's built for speed and reliability.
author | jbishopsky |
---|---|
permlink | re-geekgirl-sww1as |
category | python |
json_metadata | {"tags":["python"],"app":"peakd/2025.5.7","image":[],"users":[]} |
created | 2025-05-26 21:35:18 |
last_update | 2025-05-26 21:35:18 |
depth | 1 |
children | 0 |
last_payout | 2025-06-02 21:35:18 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 802 |
author_reputation | -74,137,584,032 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 142,982,637 |
net_rshares | 0 |
is this free? i only use pdf converter (pdf to word, excel).
author | kungfukid |
---|---|
permlink | re-geekgirl-soeo4s |
category | python |
json_metadata | {"tags":["python"],"app":"peakd/2024.11.3","image":[],"users":[]} |
created | 2024-12-13 00:02:57 |
last_update | 2024-12-13 00:02:57 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 00:02:57 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.082 HBD |
curator_payout_value | 0.081 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 61 |
author_reputation | 16,570,486,477,740 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,167,423 |
net_rshares | 456,916,989,328 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 456,172,972,559 | 10% | ||
endhivewatchers | 0 | 744,016,769 | 5% |
I have some religious manuscripts I have written over the years (and I'm still writing more). Maybe I should try them with this Python and see how it works for me. Thank you for sharing.
author | mfontom |
---|---|
permlink | re-geekgirl-20241213t15150101z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-13 14:15:06 |
last_update | 2024-12-13 14:15:06 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 14:15:06 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.078 HBD |
curator_payout_value | 0.079 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 188 |
author_reputation | 60,670,378,784,827 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,179,353 |
net_rshares | 452,562,088,341 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 452,562,088,341 | 10% |
I don't know anything about coding, but I realized that this can be done using the Python coding language.
author | momins |
---|---|
permlink | sofdgx |
category | python |
json_metadata | {"app":"hiveblog/0.1"} |
created | 2024-12-13 09:10:18 |
last_update | 2024-12-13 09:10:18 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 09:10:18 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.077 HBD |
curator_payout_value | 0.078 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 106 |
author_reputation | 56,663,015,570,290 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,174,600 |
net_rshares | 453,458,592,479 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 453,458,592,479 | 10% |
I know how to use PDFs, but this is too complicated for me, although it looks useful!
author | olympicdragon |
---|---|
permlink | re-geekgirl-20241213t121627225z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-13 04:16:27 |
last_update | 2024-12-13 04:16:27 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 04:16:27 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.080 HBD |
curator_payout_value | 0.081 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 74 |
author_reputation | 34,659,922,596,582 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,171,380 |
net_rshares | 455,267,535,312 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 455,267,535,312 | 10% |
Wow, this is so amazing! I started self-learning Python for what I want to study in school, which is Artificial Intelligence, but I just feel really stuck right now. It's probably because I learned the wrong way, lol. But hopefully I'll get back on track.
author | princekeys |
---|---|
permlink | re-geekgirl-20241216t9141979z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-16 08:01:48 |
last_update | 2024-12-16 08:01:48 |
depth | 1 |
children | 0 |
last_payout | 2024-12-23 08:01:48 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 256 |
author_reputation | 24,170,125,909,965 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,247,445 |
net_rshares | 0 |
!pimp
author | silverd510 |
---|---|
permlink | soef85 |
category | python |
json_metadata | {"app":"hiveblog/0.1"} |
created | 2024-12-12 20:50:27 |
last_update | 2024-12-12 20:50:27 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:50:27 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 5 |
author_reputation | 902,209,505,944,305 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,163,444 |
net_rshares | 0 |
Thank you, this is quite informative.
author | thesegunvictor |
---|---|
permlink | re-geekgirl-20241212t21029221z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-12 20:00:30 |
last_update | 2024-12-12 20:00:30 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:00:30 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.084 HBD |
curator_payout_value | 0.084 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 38 |
author_reputation | 272,755,948,850 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,162,642 |
net_rshares | 459,809,377,509 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 459,809,377,509 | 10% |
I didn't know about pdfplumber. It seems pretty impressive in terms of time and efficiency. I've used pdfbinder before, which merges PDF files.
author | tht |
---|---|
permlink | soefey |
category | python |
json_metadata | {"app":"hiveblog/0.1"} |
created | 2024-12-12 20:54:36 |
last_update | 2024-12-12 20:54:36 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:54:36 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.082 HBD |
curator_payout_value | 0.083 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 142 |
author_reputation | 107,808,691,511,071 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,163,575 |
net_rshares | 458,896,667,452 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 458,896,667,452 | 10% |
Yeah, it really looks like a universal tool. I wish I'd had a library like this when I was writing my graduation thesis at university years ago. It would have saved me a lot of nerves and time...
author | travoved |
---|---|
permlink | re-geekgirl-20241212t225032941z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-12 19:50:33 |
last_update | 2024-12-12 19:50:33 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 19:50:33 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.084 HBD |
curator_payout_value | 0.084 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 174 |
author_reputation | 48,422,674,105,375 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,162,467 |
net_rshares | 461,643,806,932 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 461,643,806,932 | 10% |
Even copying data from a PDF and pasting it into a Word document is a mess.
author | videoaddiction |
---|---|
permlink | re-geekgirl-20241212t23018379z |
category | python |
json_metadata | {"type":"comment","tags":["python","programming","pdfplumber","coding","proofofbrain"],"app":"ecency/3.2.0-mobile","format":"markdown+html"} |
created | 2024-12-12 20:00:18 |
last_update | 2024-12-12 20:00:18 |
depth | 1 |
children | 0 |
last_payout | 2024-12-19 20:00:18 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.084 HBD |
curator_payout_value | 0.084 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 63 |
author_reputation | 165,358,163,084,494 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,162,630 |
net_rshares | 460,727,428,016 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 460,727,428,016 | 10% |
Editing PDFs is always annoying. That's a good tool!
author | wolfplayzor |
---|---|
permlink | re-geekgirl-sofclx |
category | python |
json_metadata | {"tags":["python"],"app":"peakd/2024.11.3","image":[],"users":[]} |
created | 2024-12-13 08:51:33 |
last_update | 2024-12-13 08:51:33 |
depth | 1 |
children | 0 |
last_payout | 2024-12-20 08:51:33 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.078 HBD |
curator_payout_value | 0.078 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 63 |
author_reputation | 45,664,127,435,858 |
root_title | "Visual Debugging PDF documents With PDFPlumber" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 139,174,350 |
net_rshares | 454,360,404,590 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
geekgirl | 0 | 454,360,404,590 | 10% |