Compress MS (`compressms`)

visco.compress_ms.write_ms_to_zarr(ms_path: str, zarr_path: str, consolidated: bool, chunk_size_row: int, overwrite: bool, compressor: str, level: int)

Convert a Measurement Set to a Zarr store.

Parameters:

ms_path (str) – The path to the Measurement Set.
zarr_path (str) – The path to the Zarr store.
consolidated (bool, optional) – Whether to use a consolidated Zarr store.
chunk_size_row (int, optional) – The chunk size for the rows.
overwrite (bool, optional) – Whether to overwrite the Zarr store if it exists.
compressor (str, optional) – The name of the compressor to use.
level (int, optional) – The compression level to use.
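
A call might look like the sketch below. The keyword names follow the signature above; every value (the paths, the chunk size, the compressor name `"zstd"`, the level) is an illustrative assumption rather than a documented default, and `convert_to_zarr` is a hypothetical wrapper that only imports visco when it is actually called.

```python
# Keyword arguments mirroring the documented write_ms_to_zarr signature.
# All values are illustrative assumptions, not library defaults.
zarr_kwargs = dict(
    ms_path="observation.ms",      # hypothetical Measurement Set path
    zarr_path="observation.zarr",  # hypothetical output store path
    consolidated=True,
    chunk_size_row=10_000,
    overwrite=False,
    compressor="zstd",             # assumed compressor name
    level=4,
)


def convert_to_zarr(**kwargs):
    """Hypothetical wrapper: forward the arguments to visco (must be installed)."""
    from visco.compress_ms import write_ms_to_zarr

    return write_ms_to_zarr(**kwargs)
```

Calling `convert_to_zarr(**zarr_kwargs)` would then run the conversion on a machine where visco and the input Measurement Set are available.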

visco.compress_ms.compress_visdata(zarr_output_path: str, compressor: str, level: int, correlation: str, correlation_optimized: bool, fieldid: int, ddid: int, scan: int, column: str, outcolumn: str, batch_size: int, flag_estimate: bool, use_model_data: bool, model_data: Optional[Array] = None, decorrelation: Optional[float] = None, compressionrank: Optional[int] = None, flagvalue: Optional[int] = None, antennas: Optional[list] = None)

Compress visibility data using SVD with batched processing.

Parameters:

zarr_output_path (str) – Path to the Zarr store.
compressor (str) – Name of the compressor to use.
level (int) – Compression level to use.
correlation (str) – Comma-separated list of correlation types to process (e.g., 'XX,YY,XY,YX').
correlation_optimized (bool) – Whether to use optimized correlation processing (XX/YY and XY/YX together).
fieldid (int) – FIELD_ID to filter on.
ddid (int) – DATA_DESC_ID to filter on.
scan (int) – SCAN_NUMBER to filter on.
column (str) – Column in the MAIN table containing the visibility data to compress.
outcolumn (str) – Column name to store the compressed data.
batch_size (int) – Number of baselines to process in each batch.
flag_estimate (bool) – Whether to estimate flagged data using interpolation.
use_model_data (bool) – Whether to replace flagged data with model data.
model_data (str, optional) – Column name for model data if use_model_data is True.
decorrelation (float, optional) – Desired decorrelation level (0 to 1).
compressionrank (int, optional) – Number of singular values to keep.
flagvalue (int, optional) – Value used to replace flagged data, if specified.
antennas (list, optional) – List of antenna names to restrict processing to specific baselines.
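
To make the role of `compressionrank` concrete, here is a minimal NumPy sketch of truncated-SVD compression on a single (time x channel) visibility matrix. It illustrates the general technique only and is not visco's implementation; the function names `svd_truncate` and `svd_reconstruct` and the synthetic data are assumptions made for the example.

```python
import numpy as np


def svd_truncate(vis, rank):
    """Keep the `rank` largest singular components of a 2-D complex array."""
    u, s, vh = np.linalg.svd(vis, full_matrices=False)
    return u[:, :rank], s[:rank], vh[:rank, :]


def svd_reconstruct(u, s, vh):
    """Rebuild the low-rank approximation from the stored factors."""
    return (u * s) @ vh


# Synthetic visibilities for one baseline: a product of (n_time x 2) and
# (2 x n_chan) complex factors, so the matrix has rank 2 by construction.
rng = np.random.default_rng(0)
n_time, n_chan = 64, 32
t = rng.standard_normal((n_time, 2)) + 1j * rng.standard_normal((n_time, 2))
f = rng.standard_normal((2, n_chan)) + 1j * rng.standard_normal((2, n_chan))
vis = t @ f

# Keeping two singular values recovers the matrix up to rounding error.
u, s, vh = svd_truncate(vis, rank=2)
approx = svd_reconstruct(u, s, vh)

# Number of complex/real values stored after truncation versus before.
stored = u.size + s.size + vh.size
original = vis.size
```

With these shapes the truncated factors hold 194 values against 2048 in the original matrix; real visibilities are not exactly low rank, so the retained rank trades storage against approximation error.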

visco.compress_ms.compress_full_ms(ms_path: str, zarr_path: str, consolidated: bool, chunk_size_row: int, overwrite: bool, compressor: str, level: int, nworkers: int, nthreads: int, memory_limit: str, direct_to_workers: bool, correlation: str, correlation_optimized: bool, fieldid: int, ddid: int, scan: int, column: str, outcolumn: str, batch_size: int, dashboard_addr: Optional[str] = None, host_addr: Optional[str] = None, use_model_data: bool = False, model_data: Optional[str] = None, flag_estimate: bool = False, decorrelation: Optional[float] = None, compressionrank: Optional[int] = None, flagvalue: Optional[int] = None, antennas: Optional[list] = None)

Compress a Measurement Set using SVD with batched processing.

Parameters:

ms_path (str) – Path to the Measurement Set.
zarr_path (str) – Path to the Zarr store.
consolidated (bool) – Whether to use a consolidated Zarr store.
chunk_size_row (int) – Chunk size for the rows.
overwrite (bool) – Whether to overwrite the Zarr store if it exists.
compressor (str) – Name of the compressor to use.
level (int) – Compression level to use.
nworkers (int) – Number of Dask workers.
nthreads (int) – Number of threads per worker.
memory_limit (str) – Memory limit per worker (e.g., '4GB').
direct_to_workers (bool) – Whether to send tasks directly to workers.
correlation (str) – Comma-separated list of correlation types to process (e.g., 'XX,YY,XY,YX').
correlation_optimized (bool) – Whether to use optimized correlation processing (XX/YY and XY/YX together).
fieldid (int) – FIELD_ID to filter on.
ddid (int) – DATA_DESC_ID to filter on.
scan (int) – SCAN_NUMBER to filter on.
column (str) – Column in the MAIN table containing the visibility data to compress.
outcolumn (str) – Column name to store the compressed data.
batch_size (int) – Number of baselines to process in each batch.
dashboard_addr (str, optional) – Address for the Dask dashboard.
host_addr (str, optional) – Host address for the Dask scheduler.
use_model_data (bool, optional) – Whether to replace flagged data with model data.
model_data (str, optional) – Column name for model data if use_model_data is True.
flag_estimate (bool, optional) – Whether to estimate flagged data using interpolation.
decorrelation (float, optional) – Desired decorrelation level (0 to 1).
compressionrank (int, optional) – Number of singular values to keep.
flagvalue (int, optional) – Value used to replace flagged data, if specified.
antennas (list, optional) – List of antenna names to restrict processing to specific baselines.
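
The `correlation`, `correlation_optimized`, `antennas`, and `batch_size` parameters can be pictured with the standard-library sketch below. It is one plausible reading of the parameter descriptions, not visco's internals: the helper names are hypothetical, correlation labels are assumed to be two characters long, and baselines are assumed to be unordered pairs of distinct antennas.

```python
from itertools import combinations


def parse_correlations(correlation, optimized):
    """Split a comma-separated correlation string into processing groups.

    With optimized=True, parallel-hand labels (XX, YY) form one group and
    cross-hand labels (XY, YX) another; otherwise each label stands alone.
    """
    names = [c.strip() for c in correlation.split(",") if c.strip()]
    if not optimized:
        return [[c] for c in names]
    parallel = [c for c in names if c[0] == c[1]]
    cross = [c for c in names if c[0] != c[1]]
    return [group for group in (parallel, cross) if group]


def baseline_batches(antennas, batch_size):
    """List antenna pairs (assumed baselines) and cut them into batches."""
    pairs = list(combinations(antennas, 2))
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]


groups = parse_correlations("XX,YY,XY,YX", optimized=True)
batches = baseline_batches(["m000", "m001", "m002", "m003"], batch_size=4)
```

Four antennas give six baselines here, so a `batch_size` of 4 yields one batch of four baselines and one of two.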
