Releases: Unstructured-IO/unstructured
0.23.1
What's Changed
- feat: extract filled AcroForm field text in PDF partitioning by @badGarnet in #4372
Full Changelog: 0.23.0...0.23.1
0.23.0
What's Changed
- fix: stop decimating embedded text on dense PDF pages by @badGarnet in #4368
- fix: keep extracted text aligned with rotated PDF page images in hi_res by @badGarnet in #4367
- feat: add enrichment origins metadata field by @badGarnet in #4370
Full Changelog: 0.22.32...0.23.0
0.22.32
0.22.31
What's Changed
- fix: rename isolate_tables chunking option to isolate_table by @badGarnet in #4355
Full Changelog: 0.22.30...0.22.31
0.22.30
What's Changed
- feat: add option for table chunking by @badGarnet in #4354
Full Changelog: 0.22.29...0.22.30
0.22.29
What's Changed
- fix: handle text too long for spacy issue by @badGarnet in #4353
Full Changelog: 0.22.28...0.22.29
0.22.28
What's Changed
- fix: chunking dropping table content by @badGarnet in #4352
Full Changelog: 0.22.27...0.22.28
0.22.27
What's Changed
- fix: ndjson file type detection by @badGarnet in #4349
Full Changelog: 0.22.26...0.22.27
0.22.26
What's Changed
- Reject oversized PDF renders before bitmap allocation by @CyMule in #4345
- feat(cli): add unstructured doctor diagnostics command by @claytonlin1110 in #4342
- feat: Track the table extraction method by @vladimir-kivi-ds in #4346
Full Changelog: 0.22.23...0.22.26
0.22.23
What's Changed
- fix: first table chunk preserve col/row span by @badGarnet in #4343
Full Changelog: 0.22.22...0.22.23