Commit e99cb44

feat: add config for pca in annlite (#606)
1 parent 262b2d1 commit e99cb44

2 files changed

Lines changed: 10 additions & 8 deletions

docarray/array/storage/annlite/backend.py

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ class AnnliteConfig:
     ef_construction: Optional[int] = None
     ef_search: Optional[int] = None
     max_connection: Optional[int] = None
+    n_components: Optional[int] = None
     columns: Optional[Union[List[Tuple[str, str]], Dict[str, str]]] = None
 
 

docs/advanced/document-store/annlite.md

Lines changed: 9 additions & 8 deletions
@@ -38,14 +38,15 @@ Other functions behave the same as in-memory DocumentArray.
 
 The following configs can be set:
 
-| Name | Description | Default |
-|-------------------|---------------------------------------------------------------------------------------|---------------------------------------------------------------|
-| `n_dim` | Number of dimensions of embeddings to be stored and retrieved | **This is always required** |
-| `data_path` | The data folder where the data is located | **A random temp folder** |
-| `metric` | Distance metric to be used during search. Can be 'cosine', 'dot' or 'euclidean' | 'cosine' |
-| `ef_construction` | The size of the dynamic list for the nearest neighbors (used during the construction) | `None`, defaults to the default value in the AnnLite package* |
-| `ef_search` | The size of the dynamic list for the nearest neighbors (used during the search) | `None`, defaults to the default value in the AnnLite package* |
-| `max_connection` | The number of bi-directional links created for every new element during construction. | `None`, defaults to the default value in the AnnLite package* |
+| Name | Description | Default |
+|-------------------|---------------------------------------------------------------------------------------------------------|---------------------------------------------------------------|
+| `n_dim` | Number of dimensions of embeddings to be stored and retrieved | **This is always required** |
+| `data_path` | The data folder where the data is located | **A random temp folder** |
+| `metric` | Distance metric to be used during search. Can be 'cosine', 'dot' or 'euclidean' | 'cosine' |
+| `ef_construction` | The size of the dynamic list for the nearest neighbors (used during the construction) | `None`, defaults to the default value in the AnnLite package* |
+| `ef_search` | The size of the dynamic list for the nearest neighbors (used during the search) | `None`, defaults to the default value in the AnnLite package* |
+| `max_connection` | The number of bi-directional links created for every new element during construction. | `None`, defaults to the default value in the AnnLite package* |
+| `n_components` | The output dimension of PCA model. Should be a positive number and less than `n_dim` if it's not `None` | `None`, defaults to the default value in the AnnLite package* |
 
 *You can check the default values in [the AnnLite source code](https://github.com/jina-ai/annlite/blob/main/annlite/core/index/hnsw/index.py)
 

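The constraint documented for `n_components` (positive and less than `n_dim` unless it is `None`) can be illustrated with a minimal sketch. `AnnliteConfigSketch` below is a hypothetical stand-in that mirrors a subset of the documented fields; it is not docarray's `AnnliteConfig`, and the commit does not say whether docarray or AnnLite performs this check itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnliteConfigSketch:
    """Hypothetical stand-in for a subset of the documented config fields."""

    n_dim: int
    metric: str = 'cosine'
    ef_construction: Optional[int] = None
    ef_search: Optional[int] = None
    max_connection: Optional[int] = None
    n_components: Optional[int] = None  # field added by this commit

    def __post_init__(self):
        # None means "leave the choice to the backend default".
        if self.n_components is None:
            return
        # Documented rule: a positive number, strictly less than n_dim.
        if not 0 < self.n_components < self.n_dim:
            raise ValueError(
                f'n_components must satisfy 0 < n_components < n_dim, '
                f'got n_components={self.n_components}, n_dim={self.n_dim}'
            )


# Reduce 128-dimensional embeddings to 32 PCA components.
cfg = AnnliteConfigSketch(n_dim=128, n_components=32)
```

With docarray itself, the same values would presumably be passed through the store's `config` dictionary, for example `config={'n_dim': 128, 'n_components': 32}`; treat that call shape as an assumption and check it against the installed version.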