spherex-cutoutdb install help¶
This guide installs spherex-cutoutdb for both the original Quick Start
workflow and the integrated catalog-to-spectrum workflow. It assumes conda is
already installed.
Use conda-forge for the scientific stack, then install this repository with
pip --no-deps so that pip does not replace conda binary packages such as
astropy, photutils, pyarrow, and matplotlib.
Dependency map¶
| Workflow layer | Main dependencies |
|---|---|
| Core package and CLI | python>=3.10, setuptools, wheel, pydantic, PyYAML, rich, tqdm, pandas, pyarrow, Python stdlib sqlite3 |
| Catalog, discovery, planner, downloader | astropy, pyvo, astroquery, requests, pandas |
| Calibration cache | astropy, numpy, requests; durable files under project cache/calibrations |
| V5 photometry and integrated workflow | numpy, astropy, photutils, matplotlib, pandas, plus downloader and calibration dependencies |
| Development and tests | pytest, pytest-cov, responses, build |
Recommended conda install¶
Run these commands from the repository root.
conda create -n spxcutdb python=3.11 -y
conda activate spxcutdb
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install -y astropy photutils pyvo astroquery requests pydantic pyyaml rich tqdm pandas pyarrow matplotlib pytest pytest-cov responses build setuptools wheel pip
python -m pip install -e ".[dev]" --no-deps
For a runtime-only install without test/build extras, install the runtime packages and then install the repository without dependencies:
conda create -n spxcutdb python=3.11 -y
conda activate spxcutdb
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install -y astropy photutils pyvo astroquery requests pydantic pyyaml rich tqdm pandas pyarrow matplotlib setuptools wheel pip
python -m pip install -e . --no-deps
If you already created the environment, activate it before running any
spxcutdb command:
conda activate spxcutdb
macOS¶
Install Xcode command line tools only if git, compilers, or headers are
missing:
xcode-select --install
Then use the recommended conda install above. On Apple Silicon, conda-forge
packages should install native osx-arm64 builds when the conda installation is
native arm64. Avoid mixing Intel and Apple Silicon conda environments.
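To confirm that the active interpreter is a native arm64 build rather than an Intel build, a quick standard-library check along these lines may help. This is only a sketch: the helper name describe_build is illustrative and is not part of spherex-cutoutdb.

```python
import platform


def describe_build(system: str, machine: str) -> str:
    """Classify an interpreter build from platform.system()/platform.machine() values."""
    if system != "Darwin":
        return f"not macOS ({system} {machine})"
    if machine == "arm64":
        return "native Apple Silicon (osx-arm64)"
    if machine == "x86_64":
        # Either an Intel Mac, or an Intel build running under Rosetta 2.
        return "Intel build (osx-64)"
    return f"unrecognized macOS architecture: {machine}"


if __name__ == "__main__":
    print(describe_build(platform.system(), platform.machine()))
```

Run it with the python from the activated spxcutdb environment; an Intel result on Apple Silicon hardware suggests the conda installation itself is an Intel build.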
Verify the installed command:
spxcutdb --help
WSL / Ubuntu¶
Install basic system tools first:
sudo apt-get update
sudo apt-get install -y build-essential git curl ca-certificates
Keep the repository and project data inside the WSL Linux filesystem, for
example under ~/work/, not under /mnt/c/. The downloader and SQLite state
files are much slower and more fragile on mounted Windows paths.
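A small check like the following can flag a working directory that sits on a mounted Windows drive. It is a sketch that assumes the default WSL automount layout (/mnt/<drive letter>); the function name is_windows_mount is hypothetical and not part of the package.

```python
import os
import re
from pathlib import PurePosixPath

# Assumption: WSL mounts Windows drives at /mnt/c, /mnt/d, ... (the default layout).
_DRIVE_LETTER = re.compile(r"^[a-z]$")


def is_windows_mount(path: str) -> bool:
    """Return True if an absolute POSIX path sits under /mnt/<drive-letter>."""
    parts = PurePosixPath(path).parts
    return (
        len(parts) >= 3
        and parts[0] == "/"
        and parts[1] == "mnt"
        and _DRIVE_LETTER.match(parts[2]) is not None
    )


if __name__ == "__main__":
    cwd = os.getcwd()
    if is_windows_mount(cwd):
        print(f"{cwd} is on a mounted Windows drive; consider moving to ~/work/")
    else:
        print(f"{cwd} looks like a native Linux path")
```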
For large downloader runs, raise the open-file limit in the current shell:
ulimit -n 4096
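To see the limit that a Python process started from this shell actually inherits, the standard-library resource module can be queried. This is a minimal sketch for Unix-like systems; the helper name open_file_limits is illustrative only.

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard}")
    # RLIM_INFINITY means "no limit", so only warn on a finite, low soft limit.
    if soft != resource.RLIM_INFINITY and soft < 4096:
        print("soft limit is below 4096; consider running `ulimit -n 4096` first")
```

The soft limit is the one that ulimit -n changes for the current shell; it cannot be raised above the hard limit without elevated privileges.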
Then use the recommended conda install above.
Linux CentOS 7.9¶
CentOS 7.9 has been end-of-life since June 30, 2024 and ships an old compiler and glibc. Do not use the system Python for this project. Use conda-forge binary packages and avoid pip source builds whenever possible.
Install basic tools:
sudo yum install -y git curl ca-certificates bzip2 tar gzip make gcc gcc-c++
Create the conda environment with conda-forge runtime libraries:
conda create -n spxcutdb python=3.11 -y
conda activate spxcutdb
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install -y libstdcxx-ng openssl ca-certificates certifi astropy photutils pyvo astroquery requests pydantic pyyaml rich tqdm pandas pyarrow matplotlib pytest pytest-cov responses build setuptools wheel pip
python -m pip install -e ".[dev]" --no-deps
If conda attempts to solve with very new packages that do not support the host,
keep Python at 3.11 and install from conda-forge rather than switching to pip
source builds.
Verify the environment¶
Run these checks from the repository root:
python -m pip check
spxcutdb --help
python -c "import astropy, photutils, pyvo, astroquery, pandas, pyarrow, matplotlib, requests, pydantic, yaml, rich; import spherex_cutoutdb; print('OK')"
pytest -q
pytest -q runs the full test suite. It should not require live IRSA network
access by default.
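When the one-line import check fails, it stops at the first error. A sketch like the one below lists every module that cannot be located at once. The module list mirrors the import check above; the helper name missing_modules is illustrative and not part of the package.

```python
import importlib.util

# Top-level module names taken from the import check in this guide.
REQUIRED = [
    "astropy", "photutils", "pyvo", "astroquery", "pandas", "pyarrow",
    "matplotlib", "requests", "pydantic", "yaml", "rich", "spherex_cutoutdb",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("OK" if not missing else "missing: " + ", ".join(missing))
```

Note that find_spec only locates a module without importing it, so this reports absent packages but does not detect a broken binary build. The full import check remains the stronger test.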
Quick Start smoke¶
Use input_catalog.csv with unique Name, RA_deg, and DEC_deg columns.
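Before handing a catalog to spxcutdb, it can be pre-checked with a short standard-library script. The sketch below is not the package's own validator. It assumes that "unique" refers to the values in the Name column, and that valid coordinates satisfy 0 <= RA_deg <= 360 and -90 <= DEC_deg <= 90; the function name check_catalog is hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ("Name", "RA_deg", "DEC_deg")


def check_catalog(handle):
    """Return a list of problems found in a CSV catalog; an empty list means it passed."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems, seen = [], set()
    for lineno, row in enumerate(reader, start=2):  # line 1 is the header
        name = (row["Name"] or "").strip()
        if not name:
            problems.append(f"line {lineno}: empty Name")
        elif name in seen:
            problems.append(f"line {lineno}: duplicate Name {name!r}")
        seen.add(name)
        try:
            ra, dec = float(row["RA_deg"]), float(row["DEC_deg"])
        except (TypeError, ValueError):
            problems.append(f"line {lineno}: non-numeric RA_deg or DEC_deg")
            continue
        # Assumed ranges; NaN fails both comparisons and is reported here too.
        if not (0.0 <= ra <= 360.0 and -90.0 <= dec <= 90.0):
            problems.append(f"line {lineno}: coordinates out of range")
    return problems


if __name__ == "__main__":
    # Tiny in-memory sample with a duplicate name and an out-of-range RA.
    sample = io.StringIO("Name,RA_deg,DEC_deg\nsrc1,10.5,-20.0\nsrc1,400,0\n")
    for problem in check_catalog(sample):
        print(problem)
```

For a real file, pass an open handle, for example open("input_catalog.csv", newline=""). The spxcutdb validate step remains the authoritative check.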
The recommended smoke path is the integrated workflow because it records the
effective config, config hash, and CLI overrides for the run.
spxcutdb init ./project --catalog input_catalog.csv --target-id-column Name --force
spxcutdb config show --project ./project --effective --hash
spxcutdb config validate --project ./project
spxcutdb validate --project ./project --catalog input_catalog.csv
spxcutdb discover --project ./project --resume
spxcutdb calibration sync --project ./project --product required --download-source cloud --max-workers 8
spxcutdb calibration validate --project ./project
spxcutdb run --project ./project --catalog input_catalog.csv --download-missing --resume --cleanup-cutouts success-after-source --qa-level standard
spxcutdb summary --project ./project
The older expert path downloads cutouts first and then runs downstream photometry. Use it only when you need to debug the downloader or photometry planner separately:
spxcutdb init ./project --catalog input_catalog.csv --target-id-column Name --force --no-include-deep
spxcutdb catalog validate --project ./project --verbose
spxcutdb catalog ingest --project ./project
spxcutdb discover --project ./project --concurrency 32 --verbose
spxcutdb plan --project ./project --export-plan
spxcutdb download --project ./project --max-workers 32 --verbose
spxcutdb coverage --project ./project
spxcutdb calibration sync --project ./project --product required --download-source cloud --max-workers 8
spxcutdb calibration validate --project ./project
spxcutdb photometry plan --project ./project
Integrated workflow smoke¶
The integrated workflow plans first, skips valid photometry before requesting cutouts, downloads missing cutouts through the existing downloader, runs V5 photometry, writes durable outputs, and safely cleans temporary cutouts.
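The ordering described above (plan first, skip targets with valid photometry, download only what is missing) can be pictured with a toy model. This is purely illustrative: the function plan_targets and its arguments are hypothetical, and the real planner in spherex-cutoutdb may be organized quite differently.

```python
def plan_targets(targets, valid_photometry, cached_cutouts):
    """Split targets into those to skip, those ready for photometry, and those needing download."""
    skip, ready, download = [], [], []
    for target in targets:
        if target in valid_photometry:
            skip.append(target)      # valid photometry exists, so no cutout is requested
        elif target in cached_cutouts:
            ready.append(target)     # cutout is already on disk
        else:
            download.append(target)  # cutout must be fetched before photometry
    return skip, ready, download


if __name__ == "__main__":
    skip, ready, download = plan_targets(["a", "b", "c"], {"a"}, {"b"})
    print("skip:", skip, "ready:", ready, "download:", download)
```

The point of planning before downloading is that a resumed run never requests cutouts for targets whose photometry is already valid.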
You can also ask the integrated run to discover and sync calibration before planning:
spxcutdb run --project ./project --catalog input_catalog.csv --discover --sync-calibration --download-missing --resume --cleanup-cutouts success-after-source --qa-level standard
Troubleshooting¶
spxcutdb: command not found¶
Activate the conda environment and reinstall the editable package:
conda activate spxcutdb
python -m pip install -e ".[dev]" --no-deps
python -m pip show spherex-cutoutdb
If the package is installed but the command is still missing, run:
python -m spherex_cutoutdb --help
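To tell apart "the console script was never registered" from "the script exists but is not on PATH", a sketch like this may help. It uses only the standard library; the helper name console_script_status is illustrative and not part of the package.

```python
import shutil
from importlib import metadata


def console_script_status(command):
    """Report whether a console script is registered in metadata and visible on PATH."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        scripts = eps.select(group="console_scripts")
    else:
        # Older importlib.metadata returns a plain dict keyed by group name.
        scripts = eps.get("console_scripts", [])
    registered = any(ep.name == command for ep in scripts)
    return {"registered": registered, "on_path": shutil.which(command) is not None}


if __name__ == "__main__":
    print(console_script_status("spxcutdb"))
```

If registered is true but on_path is false, the environment's bin directory is probably not on PATH, which usually means the environment is not activated. If registered is false, the editable install did not complete.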
Binary import failures¶
Errors importing astropy, photutils, pyarrow, or matplotlib usually
mean pip replaced conda packages or the environment mixed incompatible channels.
Create a clean environment with strict conda-forge priority and reinstall the
repository with --no-deps.
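One way to spot a package that pip has replaced is to read the INSTALLER file recorded in each distribution's metadata. The sketch below assumes that conda-forge builds record conda there and pip installs record pip, which is typical but not guaranteed for every package; the helper name installer_of is hypothetical.

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the INSTALLER recorded for a distribution, or None if unavailable."""
    try:
        text = metadata.distribution(dist_name).read_text("INSTALLER")
    except metadata.PackageNotFoundError:
        return None
    return text.strip() if text else None


if __name__ == "__main__":
    for name in ("astropy", "photutils", "pyarrow", "matplotlib"):
        print(f"{name}: {installer_of(name)}")
```

A result of pip for any of these packages suggests the conda build was overwritten, and a clean environment is the safer fix.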
SSL or certificate errors¶
Update certificate packages inside the conda environment:
conda install -y -c conda-forge ca-certificates certifi openssl
python -c "import ssl; print(ssl.get_default_verify_paths())"
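The printed paths are easier to act on when you also know whether they exist. The following sketch extends the check above with standard-library calls only; certifi is reported when it is installed, since requests takes its CA bundle from certifi rather than from the ssl defaults. The helper name ca_bundle_report is illustrative.

```python
import os
import ssl


def ca_bundle_report():
    """Summarize where the ssl module looks for CA certificates and whether those paths exist."""
    paths = ssl.get_default_verify_paths()
    report = {}
    for label, value in (("cafile", paths.cafile), ("capath", paths.capath)):
        report[label] = {"path": value, "exists": bool(value) and os.path.exists(value)}
    return report


if __name__ == "__main__":
    for label, info in ca_bundle_report().items():
        print(f"{label}: {info['path']} (exists: {info['exists']})")
    try:
        import certifi
        print("certifi bundle:", certifi.where())
    except ImportError:
        print("certifi is not installed in this environment")
```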
calibration_missing and zero downloads¶
The integrated workflow checks required calibration before submitting cutouts to
the downloader. If the summary shows calibration_missing, sync and validate
calibration first:
spxcutdb calibration sync --project ./project --product required
spxcutdb calibration validate --project ./project
Then rerun spxcutdb run with --resume --download-missing.
WSL slow I/O¶
Move the repository and project directory from /mnt/c/... to a native Linux
path such as ~/work/.... SQLite and many small FITS/JSON writes are much
slower on Windows-mounted paths.
CentOS 7.9 compiler or source-build failures¶
Do not debug old system compilers first. Prefer conda-forge binaries and keep the editable install as:
python -m pip install -e . --no-deps
If pip tries to compile scientific packages, either --no-deps was omitted or the dependency was not installed from conda-forge.
Matplotlib cache or permission errors¶
The workflow writes PNG SED and QA products. If matplotlib cannot write its cache in a batch environment, set a writable cache directory:
mkdir -p "$PWD/.matplotlib-cache"
export MPLCONFIGDIR="$PWD/.matplotlib-cache"
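To confirm that the chosen directory is actually writable from the batch job, a short check such as the following can be run before the workflow starts. It is a sketch using only the standard library; the helper name is_writable_dir is hypothetical.

```python
import os
import tempfile


def is_writable_dir(path):
    """Return True if a temporary file can be created inside path."""
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        # Covers a missing directory as well as permission errors.
        return False


if __name__ == "__main__":
    cache_dir = os.environ.get("MPLCONFIGDIR")
    if cache_dir is None:
        print("MPLCONFIGDIR is not set")
    else:
        print(f"{cache_dir} writable: {is_writable_dir(cache_dir)}")
```

Creating a real temporary file is more reliable than inspecting permission bits, because it also exercises read-only mounts and quota limits.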