Posts

Showing posts from November, 2022

PySpark code with example

How to Import PySpark Pandas?

    import pyspark.pandas as ps

    # Import pandas
    import pandas as pd

How to Create a pandas DataFrame?

    technologies = {
        'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "Hadoop", "Spark", "Python", "NA"],
        'Fee': [22000, 25000, 23000, 24000, 26000, 25000, 25000, 22000, 1500],
        'Duration': ['30days', '50days', '55days', '40days', '60days', '35days', '30days', '50days', '40days'],
        'Discount': [1000, 2300, 1000, 1200, 2500, None, 1400, 1600, 0]
    }
    df = pd.DataFrame(technologies)
    print(df)

Run Pandas API DataFrame on PySpark (Spark with Python)

Take the pandas DataFrame created above and run it on PySpark. To do so, use import pyspark.pandas as ps instead of import pandas as pd, and call ps.DataFrame() to create the DataFrame.

    # Import pyspark.p...
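Since the excerpt is cut off at this point, here is a minimal sketch of the step it describes. It is not the post's original code. The helper name to_pandas_on_spark is hypothetical, and the sketch assumes PySpark 3.2 or later (where pyspark.pandas is available) plus a working Java runtime for the Spark part. The pandas part runs on its own.

```python
import pandas as pd

# Same sample data as in the post.
technologies = {
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas",
                "Hadoop", "Spark", "Python", "NA"],
    "Fee": [22000, 25000, 23000, 24000, 26000, 25000, 25000, 22000, 1500],
    "Duration": ["30days", "50days", "55days", "40days", "60days",
                 "35days", "30days", "50days", "40days"],
    "Discount": [1000, 2300, 1000, 1200, 2500, None, 1400, 1600, 0],
}

# Plain pandas: the DataFrame lives in local memory on one machine.
pdf = pd.DataFrame(technologies)
print(pdf.shape)  # (9, 4)


def to_pandas_on_spark(data):
    """Build a pandas-on-Spark DataFrame from the same dict (hypothetical helper).

    The import sits inside the function so the pandas part above still runs
    on machines where PySpark or Java is not installed (an assumption made
    for this sketch, not something the post requires).
    """
    import pyspark.pandas as ps  # pandas API on Spark

    return ps.DataFrame(data)


# Usage (requires PySpark and Java):
# psdf = to_pandas_on_spark(technologies)
# print(psdf.head())
```

The only change between the two versions is the module used to build the DataFrame (ps instead of pd). The rest of the pandas-style code can usually stay the same, although not every pandas method is implemented in the pandas API on Spark.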