DataFrame.__getitem__(item)
Returns the column as a Column.
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
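Parameters
item : int, str, Column, list or tuple
Column index, column name, Column, or a list or tuple of columns.
Returns
Column or DataFrame
A specified column, or a filtered or projected DataFrame.
If the input item is an int or str, the output is a Column.
If the input item is a Column, the output is a DataFrame filtered by this given Column.
If the input item is a list or tuple, the output is a DataFrame projected by this given list or tuple.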
Examples
>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
Retrieve a column instance.
>>> df.select(df['age']).show()
+---+
|age|
+---+
|  2|
|  5|
+---+
>>> df.select(df[1]).show()
+-----+
| name|
+-----+
|Alice|
|  Bob|
+-----+
Select multiple columns by passing a list of column names.
>>> df[["name", "age"]].show()
+-----+---+
| name|age|
+-----+---+
|Alice|  2|
|  Bob|  5|
+-----+---+

Filter rows by passing a boolean Column.
>>> df[df.age > 3].show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
>>> df[df[0] > 3].show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
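The behaviours described above amount to dispatching on the type of the item. The following is a minimal pure-Python sketch of that idea, not PySpark's implementation: `ToyColumn` and `ToyFrame` are hypothetical stand-ins invented for illustration, and a frame is modelled as a plain list of dict rows.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a Column: a name plus a per-row function."""
    name: str
    fn: Callable[[Dict[str, Any]], Any]

    def __gt__(self, other: Any) -> "ToyColumn":
        # Build a boolean column, loosely analogous to `df.age > 3`.
        return ToyColumn(f"({self.name} > {other})",
                         lambda row: self.fn(row) > other)


class ToyFrame:
    """Hypothetical stand-in for a DataFrame backed by a list of dict rows."""

    def __init__(self, rows: List[Dict[str, Any]], columns: List[str]) -> None:
        self.rows = rows
        self.columns = columns

    def __getitem__(self, item):
        if isinstance(item, str):
            # Column name -> a column.
            return ToyColumn(item, lambda row: row[item])
        if isinstance(item, int):
            # Column index -> the column at that position.
            return self[self.columns[item]]
        if isinstance(item, ToyColumn):
            # Boolean column -> a frame filtered by that column.
            return ToyFrame([r for r in self.rows if item.fn(r)], self.columns)
        if isinstance(item, (list, tuple)):
            # List or tuple of names -> a frame projected onto those names.
            names = list(item)
            return ToyFrame([{n: r[n] for n in names} for r in self.rows], names)
        raise TypeError(f"unexpected item type: {type(item)}")


df = ToyFrame([{"age": 2, "name": "Alice"}, {"age": 5, "name": "Bob"}],
              ["age", "name"])
print(df[1].name)                    # name
print(df[df["age"] > 3].rows)        # [{'age': 5, 'name': 'Bob'}]
print(df[["name", "age"]].columns)   # ['name', 'age']
```

The toy mirrors only the shape of the rule: an int or str yields a column, a column yields a filtered frame, and a list or tuple yields a projected frame. Real Column objects are lazy expressions, so nothing here reflects how Spark evaluates them.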