pycorrelator package

Tests

pycorrelator.catalog module

class pycorrelator.catalog.Catalog(data)[source]

Bases: object

This class is used to store and manipulate the catalog data for xmatch and fof.

Parameters:

data (array-like) –

The input data can be either a numpy array or a pandas dataframe.

  • np.array: The array must have a shape of (N, 2), representing N points with two values: [ra (azimuth, longitude), dec (alltitude, latitude)].

  • pd.DataFrame: The dataframe must have two columns named ‘Ra’ and ‘Dec’ (or all the possible combinations with ‘ra’, ‘dec’; ‘RA’, ‘DEC’).

get_appending_data(retain_all_columns=True, retain_columns=None, invalid_key_error=True) DataFrame[source]

Get the appending data of the points in the catalog for xmatch and fof.

Parameters:
  • retain_all_columns (bool, optional) – Whether to retain all the columns in the input dataframe. Default is True.

  • retain_columns (list, optional) – The list of columns to retain in the input dataframe. Overrides retain_all_columns if not empty.

  • invalid_key_error (bool, optional) – Whether to raise an error when the columns are not in the input dataframe. Default is True.

Returns:

The dataframe of the appending data.

Return type:

pandas.DataFrame

get_coordiantes() ndarray[Any, dtype[float64]][source]

Get the coordinate of the points in the catalog for xmatch and fof.

Returns:

The array of shape (N, 2) with [Ra, Dec].

Return type:

numpy.ndarray

get_indexes() ndarray[Any, dtype[int64]][source]

Get the indexes of the points in the catalog for xmatch and fof.

Returns:

The array of indexes of shape (N,).

Return type:

numpy.ndarray

pycorrelator.chunk module

class pycorrelator.chunk.Chunk(chunk_id, ra, dec, discription=None)[source]

Bases: object

add_boundary_data(data, index)[source]
add_central_data(data, index)[source]
farest_distance(distance=None)[source]
get_center()[source]
get_data() ndarray[Any, dtype[float64]][source]
get_index() ndarray[Any, dtype[int64]][source]

pycorrelator.chunk_generator module

class pycorrelator.chunk_generator.ChunkGenerator(margin)[source]

Bases: object

coor2id_boundary(ra: ndarray[Any, dtype[_ScalarType_co]], dec: ndarray[Any, dtype[_ScalarType_co]])[source]

Tell which boundary of chunk the given coordinate assigned to. (How to divide the sky.)

— SHOULD BE overridden by subclass —

If the object is located within the 2 * margin with the boundary of a chunk, it should be contained in the list of object indexes of that chunk.

Parameters:
  • ra (numpy.ndarray) – The array of RA. Shape: (N,).

  • dec (numpy.ndarray) – The array of Dec. Shape: (N,).

Returns:

list_of_chunk_of_list_of_object_index – A list of lists contain the index of the object.

Return type:

list

coor2id_central(ra: ndarray[Any, dtype[_ScalarType_co]], dec: ndarray[Any, dtype[_ScalarType_co]])[source]

Tell which chunk the given coordinate belongs to. (How to divide the sky.)

— SHOULD BE overridden by subclass —

Parameters:
  • ra (numpy.ndarray) – The array of RA. Shape: (N,).

  • dec (numpy.ndarray) – The array of Dec. Shape: (N,).

Returns:

chink_id – The array of chunk_id. Shape: (N,).

Return type:

numpy.ndarray

distribute(catalog: Catalog) list[Chunk][source]

Distribute the data into chunks.

Parameters:

catalog (Catalog) – The catalog to be distributed.

Returns:

chunks – List of chunks with data.

Return type:

list[Chunk]

generate()[source]

Generate the chunks.

— SHOULD BE overridden by subclass —

get_chunk(chunk_id)[source]

pycorrelator.chunk_generator_grid module

class pycorrelator.chunk_generator_grid.ChunkGeneratorByDenseGrid(margin)[source]

Bases: GridChunkGenerator

class pycorrelator.chunk_generator_grid.ChunkGeneratorByGrid(margin)[source]

Bases: GridChunkGenerator

class pycorrelator.chunk_generator_grid.ChunkGeneratorBySuperDenseGrid(margin)[source]

Bases: GridChunkGenerator

class pycorrelator.chunk_generator_grid.GridChunkConfig(center, margin, width: tuple | None = None, dec_bound: float | None = None)[source]

Bases: object

get_max_radius()[source]
class pycorrelator.chunk_generator_grid.GridChunkGenerator(margin)[source]

Bases: ChunkGenerator

add_polar_config(center, dec_bound)[source]
add_ring_config(center, width)[source]
coor2id_boundary(ra, dec)[source]

Tell which boundary of chunk the given coordinate assigned to. (How to divide the sky.)

— SHOULD BE overridden by subclass —

If the object is located within the 2 * margin with the boundary of a chunk, it should be contained in the list of object indexes of that chunk.

Parameters:
  • ra (numpy.ndarray) – The array of RA. Shape: (N,).

  • dec (numpy.ndarray) – The array of Dec. Shape: (N,).

Returns:

list_of_chunk_of_list_of_object_index – A list of lists contain the index of the object.

Return type:

list

coor2id_central(ra, dec)[source]

Tell which chunk the given coordinate belongs to. (How to divide the sky.)

— SHOULD BE overridden by subclass —

Parameters:
  • ra (numpy.ndarray) – The array of RA. Shape: (N,).

  • dec (numpy.ndarray) – The array of Dec. Shape: (N,).

Returns:

chink_id – The array of chunk_id. Shape: (N,).

Return type:

numpy.ndarray

generate()[source]

Generate the chunks.

— SHOULD BE overridden by subclass —

get_all_config()[source]
set_symmetric_ring_chunk(polar_dec, Ns_horizontal_ring)[source]

pycorrelator.disjoint_set module

class pycorrelator.disjoint_set.DisjointSet(n)[source]

Bases: object

find(x)[source]
get_groups()[source]
union(x, y)[source]

pycorrelator.euclidean_vs_angular_distance_local module

pycorrelator.euclidean_vs_angular_distance_local.compute_error(declination, distance)[source]

Purpose: Compute the relative error in Euclidean distance given declination and angular distance.

Parameters: - declination: float, the declination in degrees - distance: float, the angular distance in degrees

Returns: - error: float, the computed relative error defined as (Euclidean - angular) / angular.

pycorrelator.euclidean_vs_angular_distance_local.compute_max_relative_error(dec, distances, theta_values)[source]

pycorrelator.fof module

pycorrelator.fof.fof(catalog, tolerance) FoFResult[source]

Perform the Friends-of-Friends (FoF) grouping algorithm on a catalog.

This function applies the FoF algorithm to a given catalog. The algorithm works by linking objects that are within a specified angular distance (tolerance) of each other, forming groups or clusters of objects.

Parameters:
  • catalog (array-like) – The catalog to group.

  • tolerance (float) – The tolerance for the grouping in degrees.

Returns:

The result of the Friends-of-Friends grouping.

Return type:

FoFResult

pycorrelator.fof.group_by_quadtree(catalog, tolerance, dec_bound=None, ring_chunk=None) FoFResult[source]
pycorrelator.fof.group_by_quadtree_chunk(args: tuple[Chunk, float])[source]
pycorrelator.fof.spherical_quadtree_grouping(original_indexes: array, coordinate: array, tolerance, A2E_factor)[source]

pycorrelator.result_fof module

class pycorrelator.result_fof.FoFResult(catalog: Catalog, tolerance: float, result_list: list)[source]

Bases: object

get_coordinates() list[list[tuple]][source]

Returns the coordinates of objects grouped as lists of tuples.

Returns:

A list of lists of tuples of coordinates of objects in each group.

Return type:

list[list[tuple]]

get_group_coordinates() list[tuple][source]

Returns the center coordinates of the groups.

Returns:

A list of tuples of coordinates of the center of each group.

Return type:

list[tuple]

get_group_dataframe(min_group_size=1, coord_columns=['Ra', 'Dec'], retain_all_columns=True, retain_columns=None) DataFrame[source]

Get the grouped data as a two-level indexed pandas DataFrame.

Parameters:
  • min_group_size (int, optional) – The minimum group size to include in the DataFrame. Default is 1.

  • coord_columns (list[str], optional) – The names of the columns for the coordinates. Default is [‘Ra’, ‘Dec’].

  • retain_all_columns (bool, optional) – Whether to retain all the columns in the input (dataframe). Default is True.

  • retain_columns (list[str], optional) – The names of the columns to retain in the output dataframe. Will override retain_all_columns if not empty. Default is None.

Returns:

A two-level indexed pandas DataFrame containing the grouped data.

Return type:

pandas.DataFrame

get_group_sizes() list[int][source]

Returns the object counts in each group.

Returns:

A list of integers representing the number of objects in each group.

Return type:

list[int]

pycorrelator.result_xmatch module

class pycorrelator.result_xmatch.XMatchResult(cat1: Catalog, cat2: Catalog, tolerance, result_dict: defaultdict)[source]

Bases: object

get_dataframe1(min_match=0, coord_columns=['Ra', 'Dec'], retain_all_columns=True, retain_columns=None) DataFrame[source]

Get the first catalog with the number of matches as a pandas dataframe.

Parameters:
  • min_match (int, optional) – The minimum number of matches for an object to be included in the dataframe. Default is 0.

  • coord_columns (list[str], optional) – The names of the columns for the coordinates. Default is [‘Ra’, ‘Dec’].

  • retain_all_columns (bool, optional) – Whether to retain all the columns in the input (dataframe). Default is True.

  • retain_columns (list[str], optional) – The names of the columns to retain in the output dataframe. Will override retain_all_columns if not empty. Default is None.

Returns:

The dataframe of the first catalog with the number of matches.

Return type:

pandas.DataFrame

get_dataframe2(min_match=0, coord_columns=['Ra', 'Dec'], retain_all_columns=True, retain_columns=None) DataFrame[source]

Get the second catalog with the number of matches as a pandas dataframe.

Please refer to the get_dataframe1() and replace the ‘first catalog’ with the ‘second catalog’.

get_result_dict() defaultdict[source]
get_result_dict_reserve() defaultdict[source]
get_serial_dataframe(min_match=1, reverse=False, coord_columns=['Ra', 'Dec'], retain_all_columns=True, retain_columns=None) DataFrame[source]

Get a pandas dataframe with the information of the matching of the two catalogs in a serial manner.

Each object from the first catalog with sufficient matches (as defined by min_match) appear first, followed by their matched objects from the second catalog.

Parameters:
  • min_match (int, optional) – The minimum number of matches for an object from the first catalog to be included in the dataframe. Default is 1.

  • reverse (bool, optional) – Whether to reverse the order of catalogs (i.e., make the second catalog as the first and vice versa). Default is False.

  • coord_columns (list[str], optional) – The names of the columns for the coordinates. Default is [‘Ra’, ‘Dec’].

  • retain_all_columns (bool, optional) – Whether to retain all the columns in the input (dataframe). Default is True.

  • retain_columns (list[str], optional) – The names of the columns to retain in the output dataframe. Will override retain_all_columns if not empty. Default is None.

Returns:

The serial dataframe of the two catalogs with the number of matches.

Return type:

pandas.DataFrame

number_distribution() Counter[source]

Get the distribution of the number of matches for each object in the first catalog.

Returns:

The distribution of the number of matches for each object in the first catalog.

Return type:

collections.Counter

pycorrelator.utilities_spherical module

pycorrelator.utilities_spherical.cartesian_to_radec(cartesian_coords)[source]

Convert Cartesian coordinates to Right Ascension and Declination.

Parameters:

cartesian_coords (np.array) – Array of Cartesian coordinates [x, y, z] SHOULD BE NORMALIZED.

Returns:

(RA, DEC) in degrees.

Return type:

tuple

pycorrelator.utilities_spherical.distances_to_target(target, points)[source]

Compute the great-circle distances from a target point to a list of other points on a sphere.

This function can also handles a single point as an input.

Parameters:
  • target (tuple) – (RA, DEC) of the target point.

  • points (numpy.ndarray | tuple) – numpy array of shape (n, 2) where n is the number of points. Each row is (RA, DEC) for a point. Can also be a single point with shape (2,).

Returns:

distances – Great-circle distances to the target point.

Return type:

numpy.ndarray

pycorrelator.utilities_spherical.generate_random_point(n, seed=None)[source]

Generate random points in Right Ascension and Declination uniformly distributed on the celestial sphere.

Parameters:
  • n (int) – Number of random points to generate.

  • seed (int, optional) – Seed for the random number generator.

Returns:

(RA, DEC) arrays in degrees.

Return type:

tuple

pycorrelator.utilities_spherical.great_circle_distance(ra1, dec1, ra2, dec2)[source]

Compute the great-circle distance between two points on a sphere using their right ascension and declination.

Parameters:
  • ra1 (float) – Right ascension of the first point in degrees.

  • dec1 (float) – Declination of the first point in degrees.

  • ra2 (float) – Right ascension of the second point in degrees.

  • dec2 (float) – Declination of the second point in degrees.

Returns:

distance – Angular distance between the two points in degrees.

Return type:

float

pycorrelator.utilities_spherical.point_offset(ra_dec, angular_distance, theta)[source]

Give a point that is a given angular distance away from a specified point on the celestial sphere.

Parameters:
  • ra_dec (tuple) – (RA, DEC) in degrees for the initial point.

  • angular_distance (float) – Distance in degrees to move from the initial point.

  • theta (float) – Direction in degrees counter-clockwise from the positive DEC axis when viewed from the center of the celestial sphere.

Returns:

new_point – (RA, DEC) in degrees for the point after offset.

Return type:

tuple

Note

The direction specified by theta is counter-clockwise when viewed from the center of the celestial sphere, looking outwards. If visualizing from a point above the North Celestial Pole, the direction will appear clockwise.

pycorrelator.utilities_spherical.radec_to_cartesian(ra, dec)[source]

Convert Right Ascension and Declination to Cartesian coordinates.

Parameters:
  • ra (float) – Right Ascension in degrees.

  • dec (float) – Declination in degrees.

Returns:

Cartesian coordinates [x, y, z].

Return type:

np.array

pycorrelator.utilities_spherical.rodrigues_rotation(v, k, theta)[source]

Rotate a vector using Rodrigues’ rotation formula.

Parameters:
  • v (np.array) – Vector to be rotated.

  • k (np.array) – Unit vector indicating the axis of rotation.

  • theta (float) – Angle of rotation in degrees.

Returns:

Rotated vector.

Return type:

np.array

pycorrelator.utilities_spherical.rotate_radec_about_axis(ra, dec, axis_ra, axis_dec, theta)[source]

Rotate a point (or points) in celestial coordinates about a specified axis.

Given a point (or an array of points) defined by its Right Ascension and Declination, this function rotates it about an arbitrary axis (defined by its own RA and Dec) by a specified angle.

Parameters:
  • ra (float or np.array) – Right Ascension of the point(s) to be rotated. Can be a single value or an array of values.

  • dec (float or np.array) – Declination of the point(s) to be rotated. Can be a single value or an array of values.

  • axis_ra (float) – Right Ascension of the rotation axis. Expected to be a scalar.

  • axis_dec (float) – Declination of the rotation axis. Expected to be a scalar.

  • theta (float) – Angle of rotation in degrees. Expected to be a scalar.

Returns:

If ra and dec are scalars: Returns a tuple (rotated_RA, rotated_Dec) of scalar values. If ra and dec are arrays: Returns a tuple of arrays (rotated_RAs, rotated_Decs).

Return type:

tuple | tuple[numpy.array]

pycorrelator.xmatch module

pycorrelator.xmatch.rotate_to_center(object_coor, chunk_ra, chunk_dec)[source]
pycorrelator.xmatch.spherical_xmatching(idx1: array, coor1: array, idx2: array, coor2: array, tolerance, A2E_factor)[source]
pycorrelator.xmatch.unique_merge_defaultdicts(d1: defaultdict, d2: defaultdict)[source]

Joins two dictionaries, merging values for shared keys and preserving others.

When both dictionaries have the same key, this function makes a new list with every distinct value from either dictionary. If a key is only in one dictionary, it adds that key and its values directly to the result.

Parameters:
  • d1 (defaultdict) – A dictionary with list-type values.

  • d2 (defaultdict) – Another dictionary with list-type values.

Returns:

A dictionary with all keys from both d1 and d2. For shared keys, it has a list of unique values. For unshared keys, it has the original list.

Return type:

defaultdict

pycorrelator.xmatch.xmatch(catalog1, catalog2, tolerance, verbose=True) XMatchResult[source]

Performs a cross-match between two catalogs.

This function matches objects from two different catalogs based on their coordinates. Objects from catalog1 and catalog2 that are within a specified angular distance (tolerance) are considered matches.

Parameters:
  • catalog1 (array-like) – The first catalog.

  • catalog2 (array-like) – The second catalog.

  • tolerance (float) – The tolerance for the cross-match in degrees.

  • verbose (bool, optional) – Whether to print the progress.

Returns:

A XMatchResult object that contains the cross-match result.

Return type:

XMatchResult

pycorrelator.xmatch.xmatch_chunk(args: tuple[Chunk, Chunk, float])[source]

Module contents

pycorrelator.fof(catalog, tolerance) FoFResult[source]

Perform the Friends-of-Friends (FoF) grouping algorithm on a catalog.

This function applies the FoF algorithm to a given catalog. The algorithm works by linking objects that are within a specified angular distance (tolerance) of each other, forming groups or clusters of objects.

Parameters:
  • catalog (array-like) – The catalog to group.

  • tolerance (float) – The tolerance for the grouping in degrees.

Returns:

The result of the Friends-of-Friends grouping.

Return type:

FoFResult

pycorrelator.group_by_quadtree(catalog, tolerance, dec_bound=None, ring_chunk=None) FoFResult[source]
pycorrelator.xmatch(catalog1, catalog2, tolerance, verbose=True) XMatchResult[source]

Performs a cross-match between two catalogs.

This function matches objects from two different catalogs based on their coordinates. Objects from catalog1 and catalog2 that are within a specified angular distance (tolerance) are considered matches.

Parameters:
  • catalog1 (array-like) – The first catalog.

  • catalog2 (array-like) – The second catalog.

  • tolerance (float) – The tolerance for the cross-match in degrees.

  • verbose (bool, optional) – Whether to print the progress.

Returns:

A XMatchResult object that contains the cross-match result.

Return type:

XMatchResult