How Do I Avoid Type Errors When Internal Function Returns 'Union' That Could Be 'None'?

I've been running into a bit of weirdness with Unions (and Optionals, of course) in Python: namely, it seems that the static type checker tests properties against all members of a u

Solution 1:

The underlying function should really be defined with overloads. I'd suggest submitting a patch to pandas.

Here's what the type looks like right now:

    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: bool_t = False,
        limit=None,
        downcast=None,
    ) -> Optional[FrameOrSeries]: ...

In reality, a better way to represent this is with @overload, since the function returns None only when inplace=True:

    @overload
    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        *,  # keyword-only from here on, so inplace can be required in this overload
        inplace: Literal[True],
        limit=None,
        downcast=None,
    ) -> None: ...


    @overload
    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: Literal[False] = False,
        limit=None,
        downcast=None,
    ) -> FrameOrSeries: ...


    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: bool_t = False,
        limit=None,
        downcast=None,
    ) -> Optional[FrameOrSeries]:
        ...  # actual implementation
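To see the overload pattern in isolation, here is a minimal sketch using a hypothetical `Box` class. It is not pandas code, just a small stand-in with the same `inplace` behaviour, and the names `Box` and `filled_copy` are made up for illustration.

```python
from typing import List, Literal, Optional, overload


class Box:
    """Hypothetical stand-in for a DataFrame-like container."""

    def __init__(self, items: List[Optional[int]]) -> None:
        self.items = items

    # inplace=True: the object is mutated and nothing is returned.
    @overload
    def fillna(self, value: int, *, inplace: Literal[True]) -> None: ...

    # inplace=False (or omitted): a new Box is returned.
    @overload
    def fillna(self, value: int, *, inplace: Literal[False] = ...) -> "Box": ...

    def fillna(self, value: int, *, inplace: bool = False) -> Optional["Box"]:
        filled: List[Optional[int]] = [
            value if item is None else item for item in self.items
        ]
        if inplace:
            self.items = filled
            return None
        return Box(filled)


def filled_copy(box: Box) -> Box:
    # No assert or cast needed: the checker selects the Literal[False] overload.
    return box.fillna(0)
```

With overloads like these, a type checker should be able to pick the return type from the `inplace` argument, so callers never see an Optional at all.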

But assuming you can't change the underlying library, you have several ways to unpack the union. I made a video about this specifically for re.match, but I'll reiterate here since it's basically the same problem (Optional[T]).

Option 1: an assert indicating the expected return type

The assert tells the type checker something it cannot infer on its own: that the type is narrower than the declared one. mypy trusts the assertion and treats ret as pd.DataFrame from that point on.

import pandas as pd


def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame()
    ret = df.fillna(df)
    # Narrow Optional[pd.DataFrame] to pd.DataFrame for the type checker.
    assert ret is not None
    return ret
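The same narrowing works for any Optional[T]. Here is a small sketch with re.match, the standard-library version of this problem; the function name `parse_version` and the pattern are just illustrative choices.

```python
import re


def parse_version(text: str) -> str:
    match = re.match(r"v(\d+)\.(\d+)", text)
    # re.match returns an Optional match object; the assert narrows it
    # so that .group() type-checks on the lines below.
    assert match is not None, f"not a version string: {text!r}"
    return f"{match.group(1)}.{match.group(2)}"
```

Keep in mind that asserts are stripped when Python runs with -O, so for input you don't control, an explicit `if match is None: raise ValueError(...)` narrows the type just as well and keeps the runtime check.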

Option 2: cast

Explicitly tell the type checker that the type is what you expect, "cast"ing away the None-ness.

from typing import cast

import pandas as pd


def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame()
    # cast() does nothing at runtime; it only changes the type the checker sees.
    ret = cast(pd.DataFrame, df.fillna(df))
    return ret
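One thing worth seeing for yourself is that cast performs no check at runtime. A minimal sketch, where the `AGES` table and both function names are hypothetical:

```python
from typing import Dict, Optional, cast

# Hypothetical data standing in for a library call that returns Optional[T].
AGES: Dict[str, int] = {"ada": 36}


def lookup(name: str) -> Optional[int]:
    return AGES.get(name)


def age_of(name: str) -> int:
    # The checker now believes this is an int; nothing is verified at runtime.
    return cast(int, lookup(name))


print(age_of("ada"))     # 36
print(age_of("nobody"))  # None: the cast silently let it through
```

So a cast is only as good as your own guarantee that None cannot happen. If you are not sure, the assert is the safer choice because it fails loudly.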

Option 3: type: ignore

The (imo) hacky solution is to tell the type checker to ignore the incompatibility. I would not suggest this approach, but it can be helpful as a quick fix.

import pandas as pd


def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame()
    ret = df.fillna(df)
    return ret  # type: ignore
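If you do go this route with mypy, you can at least scope the ignore to a single error code, so unrelated mistakes on the same line are still reported. A sketch, assuming mypy's `return-value` error code and a made-up `PORTS` table:

```python
from typing import Dict, Optional

# Hypothetical lookup table standing in for a library call that returns Optional[T].
PORTS: Dict[str, int] = {"http": 80, "https": 443}


def lookup_port(scheme: str) -> Optional[int]:
    return PORTS.get(scheme)


def known_port(scheme: str) -> int:
    # Silences only the "incompatible return value" error on this line;
    # any other error mypy finds here is still reported.
    return lookup_port(scheme)  # type: ignore[return-value]
```

This is still a suppression rather than a fix: if `lookup_port` returns None at runtime, `known_port` will return None too.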

Solution 2:

The pandas.DataFrame.fillna method is defined as returning either DataFrame or None.

If there is a possibility that a function will return None, then this should be documented by using an Optional type hint. It would be wrong to try to hide the fact a function could return None by using a cast or a comment to ignore the warning such as:

return df  # type: ignore

If the function could return None, use Optional

import numpy as np
import pandas as pd
from typing import Optional


def test_dummy() -> Optional[pd.DataFrame]:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    return df
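Returning Optional moves the decision to the caller, who then has to handle the None case explicitly. A minimal sketch of what that looks like on the calling side, using hypothetical functions rather than pandas:

```python
from typing import List, Optional


def first_negative(values: List[float]) -> Optional[float]:
    """Return the first negative value, or None if there is none."""
    for value in values:
        if value < 0:
            return value
    return None


def describe(values: List[float]) -> str:
    result = first_negative(values)
    # The explicit None check narrows Optional[float] to float below it.
    if result is None:
        return "no negative values"
    return f"first negative value: {result}"
```

Here the type checker is satisfied without any assert, cast, or ignore comment, because both outcomes are handled.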

If the function is guaranteed not to return None, there are these options

If you can guarantee that a function will not return None, but it cannot be statically inferred by a type checker, then there are three options.

Option 1: Use an assertion to indicate that DataFrame is not None

This is the approach recommended by the mypy documentation.

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    assert df is not None 
    return df

Option 2: Use a cast

from typing import cast

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = cast(pd.DataFrame, df.fillna(value=0))
    return df

Option 3: Tell mypy to ignore the warning (not recommended)

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    return df  # type: ignore
