I encountered several hurdles setting up apache-beam[gcp] on my M1 Mac. I never got a local install to work, so I set up a local Docker environment instead.

It turns out that you can connect to a local Docker container as a “remote” in JetBrains IDEs. That setup doesn’t support dynamic package installation, though, because the containers are ephemeral. So you need to install the packages as part of the image build, and then rebuild the image every time you add a package.

Another idea is to SSH into the Docker container. Another path might have been to use Rosetta.

fastavro

apache-beam[gcp] depends on fastavro, which wouldn’t install because the build couldn’t find Python.h in the Apple-provided version of Python (3.9.6). Using a Homebrew-installed version of Python fixed it. I happened to use python@3.11, but probably any Homebrew version would have worked.
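A quick way to tell whether a given interpreter can build C extensions like fastavro is to check whether its include directory actually contains Python.h. The following is a minimal sketch using only the standard library; the function names `python_header_path` and `has_python_header` are my own, chosen for illustration.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_header():
    """Return True if the header needed to compile extension modules exists."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    status = "found" if has_python_header() else "missing"
    print(f"{python_header_path()}: {status}")
```

Running this with each candidate interpreter (the Apple-provided one and the Homebrew one) should show which of them ships the header.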

pyarrow

apache-beam[gcp] also depends on pyarrow, which won’t install until you jump through a few hoops:

  1. Install cmake by downloading the dmg and copying the app to Applications.
  2. Install pyarrow separately as follows:
     pip install --no-use-pep517 --no-build-isolation pyarrow

Nope, never mind, this shit doesn’t work.
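After several install attempts it is easy to lose track of what actually landed in the environment. A small check like the one below can report which of the two troublesome dependencies are importable in the current interpreter. This is only a sketch: `missing_modules` is a hypothetical helper name, and it only tests whether the modules can be located, not whether they work.

```python
import importlib.util


def missing_modules(names=("fastavro", "pyarrow")):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("fastavro and pyarrow are both importable")
```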