Apr-03-2025, 12:02 PM
Team,
We have three columns in a table (Mobile Number, Home Phone Number, Business Phone Number).
I am writing a Python script to identify the Valid and Invalid contact Records.
1) If Any of the Phone Number is Valid US Number - we have to mark that record as Valid Number.
2) If all the three Phone Numbers are Invalid, then we have to mark that record as Invalid contact.
I have used this code - but the code always shows Incorrect Number even one of the column has a valid number. Can someone help me on this. Thank you!
We have three columns in a table (Mobile Number, Home Phone Number, Business Phone Number).
I am writing a Python script to identify the Valid and Invalid contact Records.
1) If Any of the Phone Number is Valid US Number - we have to mark that record as Valid Number.
2) If all the three Phone Numbers are Invalid, then we have to mark that record as Invalid contact.
I have used this code - but the code always shows Incorrect Number even one of the column has a valid number. Can someone help me on this. Thank you!
import phonenumbers
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
# Define a UDF to parse and validate US phone numbers
def parse_phone_number(mobile_phone_number, home_phone_number, busn_phone_number):
try:
is_valid_mobile = False
is_valid_home = False
is_valid_busn = False
if mobile_phone_number:
parsed_mobile_number = phonenumbers.parse(mobile_phone_number, "US")
if phonenumbers.is_valid_number(parsed_mobile_number):
is_valid_mobile = True
if home_phone_number:
parsed_home_number = phonenumbers.parse(home_phone_number, "US")
if phonenumbers.is_valid_number(parsed_home_number):
is_valid_home = True
if busn_phone_number:
parsed_busn_number = phonenumbers.parse(busn_phone_number, "US")
if phonenumbers.is_valid_number(parsed_busn_number):
is_valid_busn = True
if is_valid_mobile or is_valid_home or is_valid_busn:
return "Correct Number"
else:
return "Incorrect Number"
except Exception as e:
return f"Error parsing phone number: {e}"
parse_phone_number_udf = udf(parse_phone_number, StringType())
Query = """ select * from datalabs.lab_tech_bi.KYC_dash_analysis where BUSN_PH_NBR = '5852330889' """
df = spark.sql(Query)
df = df.withColumn('Correct_number', parse_phone_number_udf(df['MOBILE_PH_NBR'], df['HOME_PH_NBR'], df['BUSN_PH_NBR']))
# Create a temporary view from the DataFrame
df.createOrReplaceTempView("final_output")
# Use SQL to create a table from the temporary view
spark.sql("""
CREATE OR REPLACE TABLE datalabs.lab_tech_bi.final_output_table AS
SELECT * FROM final_output
""")
display(df)
