Python Forum
Apply textual data cleaning to several CSV files
#1
I need to perform a textual analysis of several speeches. The speeches were transcribed (using OCR) from several PDF files into CSV files. Each CSV file contains a column titled speech, which holds speeches from different speakers (one speaker per row). I wrote a function to clean up the most common OCR errors. I applied this function to a single file and it does the job, so I am now trying to apply it to all of the CSV files. However, I keep getting the error "TypeError: expected string or bytes-like object", even though the same code works on a single file, so I am stuck. Can someone help me? Any suggestion is appreciated.
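The original code is not shown in the post, so the following is only a minimal sketch of one likely cause and a possible fix. It assumes the files are read with pandas and that the cleaning function relies on `re.sub`. Under those assumptions, an empty cell in the speech column of one of the files is read as NaN (a float), and `re.sub` raises exactly this TypeError when given a non-string. The names `clean_speech` and `clean_all`, the two regex rules, and the `speeches/*.csv` pattern are hypothetical placeholders, not the poster's actual code.

```python
import glob
import re


def clean_speech(text):
    """Hypothetical stand-in for the OCR cleaning function."""
    # pandas reads empty CSV cells as NaN (a float); re.sub() raises
    # "TypeError: expected string or bytes-like object" on non-strings,
    # so anything that is not a string is mapped to an empty string here.
    if not isinstance(text, str):
        return ""
    text = re.sub(r"-\s*\n\s*", "", text)  # rejoin words hyphenated across lines
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    return text.strip()


def clean_all(pattern="speeches/*.csv"):
    """Apply clean_speech to the 'speech' column of every matching CSV file."""
    import pandas as pd  # imported here so clean_speech is usable without pandas

    cleaned = {}
    for path in sorted(glob.glob(pattern)):
        df = pd.read_csv(path)
        df["speech"] = df["speech"].map(clean_speech)
        cleaned[path] = df
    return cleaned
```

If this is indeed the cause, it would also explain why the single file worked: that file may simply have no empty speech cells. Checking `df["speech"].isna().sum()` for each file would confirm or rule this out.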