I can't translate HTML tags that contain other tags (such as <a href=..>..</a> or <em>..</em>)
In the example below, the paragraph <p class="JAGAAA">..</p> is the problem: it does not get translated. All the other p classes are translated fine. Only this class fails, because it contains <a href=..>..</a> or <em>..</em> tags.
I have tried many things and I don't know why my code is not working. I don't get any error; this class simply is not translated.
<p class="JAGAAA">Intr-un articol precedent, <a href="https://neculaifantanaru.com/dupa-toate-regulile-artei.html"> <em>Dupa toate regulile artei</em> </a>, v-am povestit despre tanarul Hamlet, care voia sa razbune moartea tatalui sau</p>
**This is the relevant part of the code:**
import os
from bs4 import BeautifulSoup, NavigableString
import re
import textwrap
from googletrans import Translator
import pprint
...
    with open(f"{base_path}/{file}", "r", encoding='utf8', errors='ignore') as open_file:
        data = open_file.read()
    if data == "":
        print("{} is empty".format(file))
        continue
    # Serialized copy of the document; every replacement below is done on this string
    lxml1 = str(BeautifulSoup(data, 'lxml'))
    #lxml1 = data
    lxml1 = lxml1.replace("\ufeff", " ")
    #lxml1 = lxml1.replace("\n" , " ")
    #lxml1 = re.sub(' +', ' ', lxml1)
    if read_tags:
        soup = BeautifulSoup(data, 'lxml')
        title_tag = soup.find("title")
        ist_p_tag = soup.find("p", class_="text_obisnuit2")
        ist3_p_tag = soup.find("p", class_="JAGAAA")
        second_p_tag = soup.find("p", class_="donoo")
        meta_tag = soup.find("meta")
        if title_tag is None:
            print("Title tag not found")
        else:
            translated_title = translator.translate(title_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(title_tag.text, translated_title.text)
        if meta_tag is None:
            print("meta tag not found")
        else:
            translated_meta = translator.translate(meta_tag["content"], dest=input_lang)
            lxml1 = lxml1.replace(meta_tag["content"], translated_meta.text)
        if ist_p_tag is None:
            print("<p class='text_obisnuit2' /> not found")
        else:
            translated_p = translator.translate(ist_p_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(ist_p_tag.text, translated_p.text)
        if ist3_p_tag is None:
            print("<p class='JAGAAA' /> not found")
        else:
            # This is the paragraph that stays untranslated
            translated_p = translator.translate(ist3_p_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(ist3_p_tag.text, translated_p.text)
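
A likely cause of the silent failure, sketched under assumptions: `ist3_p_tag.text` joins the text of all descendants and drops the `<a>` and `<em>` markup, so that exact string does not occur in the serialized HTML and `str.replace` changes nothing. The snippet below reproduces this on a shortened copy of the paragraph, then shows one possible workaround that translates each text node in place. `fake_translate` and `translate_text_nodes` are hypothetical names introduced only for illustration; `fake_translate` stands in for `translator.translate(...).text` so the example runs without network access, and `html.parser` is used instead of `lxml` only to keep dependencies minimal.

```python
from bs4 import BeautifulSoup, NavigableString

# Shortened copy of the problem paragraph from the question.
html = (
    '<p class="JAGAAA">Intr-un articol precedent, '
    '<a href="https://neculaifantanaru.com/dupa-toate-regulile-artei.html"> '
    '<em>Dupa toate regulile artei</em> </a>, v-am povestit despre tanarul Hamlet</p>'
)

soup = BeautifulSoup(html, "html.parser")
p_tag = soup.find("p", class_="JAGAAA")

# .text has no <a>/<em> markup in it, so it is not a substring of the HTML,
# and replacing it in the serialized document is a no-op.
serialized = str(soup)
text_found = p_tag.text in serialized
replace_was_noop = serialized.replace(p_tag.text, "TRANSLATED") == serialized


def fake_translate(text):
    """Hypothetical stand-in for translator.translate(text, dest=...).text."""
    return text.upper()


def translate_text_nodes(tag, translate):
    """Translate every non-blank text node in place, keeping child tags and attributes."""
    for node in list(tag.find_all(string=True)):
        if node.strip():
            node.replace_with(NavigableString(translate(str(node))))


translate_text_nodes(p_tag, fake_translate)
result = str(p_tag)

print(text_found, replace_was_noop)
print(result)
```

With this approach the `href` attribute and the `<em>` wrapper are preserved because only the text nodes are touched. One trade-off to keep in mind: each fragment is translated separately, so the translator sees less sentence context than it would for the whole paragraph.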
