pyspark.sql.functions.array_max
- pyspark.sql.functions.array_max(col)
Array function: returns the maximum value of the array. Null elements are skipped, and an empty array yields NULL.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- col : Column or str
The name of the column or an expression that represents the array.
- Returns
Column
A new column that contains the maximum value of each array.
Examples
Example 1: Basic usage with integer array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], ['data'])
>>> df.select(sf.array_max(df.data)).show()
+---------------+
|array_max(data)|
+---------------+
|              3|
|             10|
+---------------+
Example 2: Usage with string array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(['apple', 'banana', 'cherry'],)], ['data'])
>>> df.select(sf.array_max(df.data)).show()
+---------------+
|array_max(data)|
+---------------+
|         cherry|
+---------------+
Example 3: Usage with mixed type array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(['apple', 1, 'cherry'],)], ['data'])
>>> df.select(sf.array_max(df.data)).show()
+---------------+
|array_max(data)|
+---------------+
|         cherry|
+---------------+
Example 4: Usage with array of arrays
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([[2, 1], [3, 4]],)], ['data'])
>>> df.select(sf.array_max(df.data)).show()
+---------------+
|array_max(data)|
+---------------+
|         [3, 4]|
+---------------+
Example 5: Usage with empty array
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
>>> schema = StructType([
...   StructField("data", ArrayType(IntegerType()), True)
... ])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.array_max(df.data)).show()
+---------------+
|array_max(data)|
+---------------+
|           NULL|
+---------------+
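The null handling seen in Examples 1 and 5 can be summarized with a small plain-Python model. This is only a sketch of the observed behavior, not Spark's implementation: the helper name `array_max_model` is hypothetical, a null array is assumed to map to null, and Python's own ordering stands in for Spark's type-specific ordering.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def array_max_model(values: Optional[Sequence[Optional[T]]]) -> Optional[T]:
    """Hypothetical plain-Python model of array_max for a single row."""
    # Assumption: a null array produces a null result.
    if values is None:
        return None
    # Null elements are skipped, as in the [None, 10, -1] row of Example 1.
    non_null = [v for v in values if v is not None]
    # An empty array (or one holding only nulls) yields null, as in Example 5.
    return max(non_null) if non_null else None


print(array_max_model([2, 1, 3]))
print(array_max_model([None, 10, -1]))
print(array_max_model([]))
```

The model runs without a Spark session, which makes it convenient for reasoning about edge cases before checking them against a real DataFrame.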