pyspark.pandas.date_range

pyspark.pandas.date_range(start: Union[str, Any] = None, end: Union[str, Any] = None, periods: Optional[int] = None, freq: Union[str, pandas._libs.tslibs.offsets.DateOffset, None] = None, tz: Union[str, datetime.tzinfo, None] = None, normalize: bool = False, name: Optional[str] = None, closed: Optional[str] = None, **kwargs: Any) → pyspark.pandas.indexes.datetimes.DatetimeIndex

Return a fixed frequency DatetimeIndex.

Parameters
start : str or datetime-like, optional

Left bound for generating dates.

end : str or datetime-like, optional

Right bound for generating dates.

periods : int, optional

Number of periods to generate.

freq : str or DateOffset, default ‘D’

Frequency strings can have multiples, e.g. ‘5H’.

tz : str or tzinfo, optional

Time zone name for returning localized DatetimeIndex, for example ‘Asia/Hong_Kong’. By default, the resulting DatetimeIndex is timezone-naive.

normalize : bool, default False

Normalize start/end dates to midnight before generating date range.

name : str, default None

Name of the resulting DatetimeIndex.

closed : {None, ‘left’, ‘right’}, optional

Make the interval closed with respect to the given frequency to the ‘left’, ‘right’, or both sides (None, the default).

**kwargs

For compatibility. Has no effect on the result.
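As a rough illustration of what normalize=True means, the sketch below truncates a start timestamp to midnight before stepping forward by whole days. It uses only the standard library; `normalize_to_midnight` is a hypothetical helper for illustration, not part of pyspark.pandas.

```python
from datetime import datetime, time, timedelta

def normalize_to_midnight(ts):
    """Drop the time-of-day part of ts, keeping the calendar date."""
    return datetime.combine(ts.date(), time.min)

# A start of 10:30 is moved back to 00:00 on the same day,
# then three daily points are generated from it.
start = normalize_to_midnight(datetime(2018, 1, 1, 10, 30))
days = [start + timedelta(days=i) for i in range(3)]
for d in days:
    print(d)
```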

Returns
rng : DatetimeIndex

See also

DatetimeIndex

An immutable container for datetimes.

Notes

Of the four parameters start, end, periods, and freq, exactly three must be specified. If freq is omitted, the resulting DatetimeIndex will have periods linearly spaced elements between start and end (closed on both sides).
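The linear spacing used when freq is omitted can be worked out by hand: the step is (end - start) / (periods - 1). The following is a minimal sketch of that arithmetic using only the standard library; `linspace_dates` is a hypothetical helper, not part of pyspark.pandas, and it assumes periods is at least 2.

```python
from datetime import datetime

def linspace_dates(start, end, periods):
    """Return `periods` evenly spaced datetimes from start to end, inclusive."""
    if periods < 2:
        raise ValueError("this sketch assumes periods >= 2")
    # Gap between neighbouring points so that the last one lands on `end`.
    step = (end - start) / (periods - 1)
    return [start + i * step for i in range(periods)]

points = linspace_dates(datetime(2018, 4, 24), datetime(2018, 4, 27), 3)
for p in points:
    print(p)
```

Three days split into two equal gaps gives a step of 36 hours, so the middle point falls at noon on 2018-04-25.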

To learn more about the frequency strings, see the offset aliases section of the pandas time series user guide.

Examples

Specifying the values

The next four examples vary the combination of start, end and periods used to specify the range.

Specify start and end, with the default daily frequency.

>>> ps.date_range(start='1/1/2018', end='1/08/2018')  
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
              dtype='datetime64[ns]', freq=None)

Specify start and periods, the number of periods (days).

>>> ps.date_range(start='1/1/2018', periods=8)  
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
              dtype='datetime64[ns]', freq=None)

Specify end and periods, the number of periods (days).

>>> ps.date_range(end='1/1/2018', periods=8)  
DatetimeIndex(['2017-12-25', '2017-12-26', '2017-12-27', '2017-12-28',
               '2017-12-29', '2017-12-30', '2017-12-31', '2018-01-01'],
              dtype='datetime64[ns]', freq=None)
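When only end and periods are given, the range is counted backwards from end. A minimal standard-library sketch of that calculation for the daily case follows; `daily_range_ending` is a hypothetical helper for illustration, not part of pyspark.pandas.

```python
from datetime import date, timedelta

def daily_range_ending(end, periods):
    """Return `periods` consecutive days whose last element is `end`."""
    step = timedelta(days=1)
    # The first element sits (periods - 1) steps before the last one.
    start = end - (periods - 1) * step
    return [start + i * step for i in range(periods)]

days = daily_range_ending(date(2018, 1, 1), 8)
print(days[0], days[-1])
```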

Specify start, end, and periods; the frequency is generated automatically (linearly spaced).

>>> ps.date_range(
...     start='2018-04-24', end='2018-04-27', periods=3
... )  
DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',
               '2018-04-27 00:00:00'],
              dtype='datetime64[ns]', freq=None)

Other Parameters

Change the freq (frequency) to 'M' (month end frequency).

>>> ps.date_range(start='1/1/2018', periods=5, freq='M')  
DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',
               '2018-05-31'],
              dtype='datetime64[ns]', freq=None)

Multiples of a frequency are allowed.

>>> ps.date_range(start='1/1/2018', periods=5, freq='3M')  
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
               '2019-01-31'],
              dtype='datetime64[ns]', freq=None)

freq can also be specified as an Offset object.

>>> ps.date_range(
...     start='1/1/2018', periods=5, freq=pd.offsets.MonthEnd(3)
... )  
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
               '2019-01-31'],
              dtype='datetime64[ns]', freq=None)
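To make the month end stepping concrete, the sketch below reproduces the dates for 'M' and '3M' with the standard library. `month_ends` is a hypothetical helper written for this illustration; it is not how pandas or pyspark.pandas evaluates offsets internally, only a way to check the arithmetic.

```python
import calendar
from datetime import date

def month_ends(start, periods, multiple=1):
    """Month-end dates on or after `start`, taking every `multiple`-th month."""
    out = []
    year, month = start.year, start.month
    for _ in range(periods):
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        out.append(date(year, month, last_day))
        # Advance by `multiple` months, rolling the year over as needed.
        month_index = (month - 1) + multiple
        year += month_index // 12
        month = month_index % 12 + 1
    return out

monthly = month_ends(date(2018, 1, 1), 5)
quarterly = month_ends(date(2018, 1, 1), 5, multiple=3)
print(monthly)
print(quarterly)
```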

closed controls whether to include start and end that are on the boundary. The default includes boundary points on either end.

>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed=None
... )  
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04'],
              dtype='datetime64[ns]', freq=None)

Use closed='left' to exclude end if it falls on the boundary.

>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed='left'
... )  
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'], dtype='datetime64[ns]', freq=None)

Use closed='right' to exclude start if it falls on the boundary.

>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed='right'
... )  
DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-04'], dtype='datetime64[ns]', freq=None)
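The effect of closed can be summarised as trimming one boundary point from a fully closed range. The sketch below shows this for daily steps with the standard library; `daily_range` is a hypothetical helper for illustration, not part of pyspark.pandas.

```python
from datetime import date, timedelta

def daily_range(start, end, closed=None):
    """Daily dates from start to end, trimmed according to `closed`."""
    if closed not in (None, "left", "right"):
        raise ValueError("closed must be None, 'left' or 'right'")
    n_days = (end - start).days
    days = [start + timedelta(days=i) for i in range(n_days + 1)]
    # 'left' keeps the left boundary and drops the right one (end).
    if closed == "left" and days and days[-1] == end:
        days = days[:-1]
    # 'right' keeps the right boundary and drops the left one (start).
    if closed == "right" and days and days[0] == start:
        days = days[1:]
    return days

both = daily_range(date(2017, 1, 1), date(2017, 1, 4))
left = daily_range(date(2017, 1, 1), date(2017, 1, 4), closed="left")
right = daily_range(date(2017, 1, 1), date(2017, 1, 4), closed="right")
print(len(both), left[-1], right[0])
```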