in Education by
I am utilizing the following to find missing values in my spark df: from pyspark.sql.functions import col,sum df.select(*(sum(col(c).isNull().cast("int")).alias(c) for c in df.columns)).show() from my sample spark df below: import numpy as np from pyspark.sql import SparkSession spark = SparkSession.builder.appName('SparkByExamples.com').getOrCreate() data = [ ("James","CA",np.NaN), ("Julia","",None), ("Ram",None,200.0), ("Ramya","NULL",np.NAN) ] df =spark.createDataFrame(data,["name","state","number"]) df.show() How can I convert result of the prior missing count lines into a pandas dataframe? My real df has 26 columns and showing it in a spark df is messy and misaligned. JavaScript questions and answers, JavaScript questions pdf, JavaScript question bank, JavaScript questions and answers pdf, mcq on JavaScript pdf, JavaScript questions and solutions, JavaScript mcq Test , Interview JavaScript questions, JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)

1 Answer

0 votes
by
This might not be as clean as the actual pandas df with a table, but hopefully this would work for you: From your first code, remove the .show() call: df.select(*(sum(col(c).isNull().cast("int")).alias(c) for c in df.columns)) You can assign a variable for that line or go straight with toPandas() call sdf = df.select(*(sum(col(c).isNull().cast("int")).alias(c) for c in df.columns)) new_df = sdf.toPandas().T print(new_df) The .T call is to transpose the dataframe. If you have several columns, without transposing it will truncate the columns and you will not be able to see all columns. Again, this does not have the actual table, but at least this is more readable than a spark df. UPDATE: You can get that table look if after the last variable, you convert it to pandas df if you prefer that look. There could be another way or a more efficient way to do this, but so far this one works. pd.DataFrame(new_df)

Related questions

0 votes
    How can I convert an RDD to a dataframe? I converted a data frame to rdd using .rdd. After processing it I ... convert it back to rdd Select the correct answer from above options...
asked Jan 21, 2022 in Education by JackTerrance
0 votes
    How can I convert an RDD to a dataframe? I converted a data frame to rdd using .rdd. After processing ... ,Core Questions, Core Hadoop MCQ,core interview questions for experienced...
asked Oct 31, 2021 in Education by JackTerrance
0 votes
    Currently I'm returning column name of the max value in the each row. df['Active'] = df.idxmax( ... JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Apr 5, 2022 in Education by JackTerrance
0 votes
    Currently I'm returning column name of the max value in the each row. df['Active'] = df.idxmax( ... JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Apr 2, 2022 in Education by JackTerrance
0 votes
    I am loading a (5gb compressed file) into memory (aws), creating a dataframe(in spark) and trying ... JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Jun 12, 2022 in Education by JackTerrance
0 votes
    I'am trying to follow this tutorial: ... for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Jun 2, 2022 in Education by JackTerrance
0 votes
    I'am trying to follow this tutorial: ... for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Apr 29, 2022 in Education by JackTerrance
0 votes
    I have a dataframe as below itm Date Amount 67 420 2012-09-30 00:00:00 65211 68 421 2012-09-09 00 ... solutions would be appreciated. Select the correct answer from above options...
asked Jan 28, 2022 in Education by JackTerrance
0 votes
    I want to select rows from a DataFrame based on values in some column in pandas, How can I do it? I ... WHERE column_name = some_value Select the correct answer from above options...
asked Jan 22, 2022 in Education by JackTerrance
0 votes
    I have 2 columns in the python dataframe. I want to check each row in my Column A for any value that ... for this particular purpose. Select the correct answer from above options...
asked Jan 9, 2022 in Education by JackTerrance
0 votes
    I have a Python dictionary like the following: {u'2012-06-08': 388, u'2012-06-09': 388, u'2012-06-10 ... (my_dict,index=my_dict.keys()) Select the correct answer from above options...
asked Jan 27, 2022 in Education by JackTerrance
0 votes
    For instance, given ( 1, 2, 5, 6, 7), i'd like to determine that 3 and 4 are missing? ... questions, JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Jun 14, 2022 in Education by JackTerrance
0 votes
    For instance, given ( 1, 2, 5, 6, 7), i'd like to determine that 3 and 4 are missing? ... questions, JavaScript Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked Jun 5, 2022 in Education by JackTerrance
0 votes
    missing data in pandas object is represented thought : 1.null 2.none 3.missing 4.nan Select the correct answer from above options...
asked Dec 1, 2021 in Education by JackTerrance
0 votes
    for my university assignment, I have to produce a csv file with all the distances of the airports of ... Questions for Interview, JavaScript MCQ (Multiple Choice Questions)...
asked May 22, 2022 in Education by JackTerrance
...