API documentation
Introduction
The project consists of five top-level modules, namely:
so_ana_analysis: containing the Jupyter notebooks for analysis
so_ana_doc_worker: all code related to processing documents
so_ana_management: all tools for managing the overall flow
so_ana_util: several utilities
sqlalchemy_models: containing the relational database models and related functionality
Moreover, note that database migrations are managed using alembic.
The migration code is placed in the folder alembic.
command line application so_ana.py
package so_ana_doc_worker
Contains submodules with code related to processing documents.
module so_ana_doc_worker.extr_post_deps
module so_ana_doc_worker.extract_posts
module so_ana_doc_worker.LDA
module so_ana_doc_worker.schemas
sub-module so_ana_doc_worker.so_ana_process_posts
module so_ana_doc_worker.so_ana_reporting
package so_ana_management
Contains any code related to workflow management.
module so_ana_management.flow
sub-module so_ana_management.flow_services
Contains flow services to be executed in flow.py.
Hint: this module is work in progress as part of the first refactoring. The final target is that the flow module contains no business logic of its own for linking steps, but instead calls flow services to do the processing.
Author: HBernigau Date: 01.2022
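The intended split between the flow module and the flow services can be illustrated with a minimal sketch. All names in it (PostSummary, summarize_posts, run_flow) are hypothetical and are not taken from the project; the sketch only shows the pattern of a flow step that delegates to a service and holds no processing logic itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PostSummary:
    """Hypothetical result type, invented for this illustration."""
    n_posts: int
    n_tokens: int


def summarize_posts(posts: List[str]) -> PostSummary:
    """Hypothetical flow service: the business logic lives here."""
    n_tokens = sum(len(post.split()) for post in posts)
    return PostSummary(n_posts=len(posts), n_tokens=n_tokens)


def run_flow(posts: List[str]) -> PostSummary:
    """Hypothetical flow step: it only passes its inputs on to the service."""
    return summarize_posts(posts)


summary = run_flow(["how to sort a list", "what is a decorator"])
```

With this layout the service can be tested without starting a workflow, and the flow module stays a thin description of the order in which services are called.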
module so_ana_management.management_deps
module so_ana_management.management_utils
package so_ana_util
Contains several utilities which are not directly related to any specific context. At package level, the module provides some important directories:
PROJ_ROOT_PATH is the root path for the project
PROJ_DATA_PATH is the path to data
PROJ_CONFIG_PATH is the path containing configurations
PROJ_OUTP_PATH is the root for output data
Furthermore, the function get_main_config loads the configuration file for the current run.
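As a rough sketch of how such package-level paths and a configuration loader might fit together, consider the following. The folder names, the JSON file format, the file name main_config.json and the signature of get_main_config used here are assumptions made for illustration only; the actual definitions in so_ana_util may differ.

```python
import json
import tempfile
from pathlib import Path

# Assumption: a throwaway directory stands in for the real project root.
PROJ_ROOT_PATH = Path(tempfile.mkdtemp())
PROJ_DATA_PATH = PROJ_ROOT_PATH / "data"      # hypothetical folder name
PROJ_CONFIG_PATH = PROJ_ROOT_PATH / "config"  # hypothetical folder name
PROJ_OUTP_PATH = PROJ_ROOT_PATH / "output"    # hypothetical folder name


def get_main_config(file_name="main_config.json"):
    """Illustrative stand-in: load a JSON configuration from PROJ_CONFIG_PATH."""
    with open(PROJ_CONFIG_PATH / file_name, encoding="utf-8") as fh:
        return json.load(fh)


# Usage: write a sample configuration, then read it back.
PROJ_CONFIG_PATH.mkdir(parents=True, exist_ok=True)
(PROJ_CONFIG_PATH / "main_config.json").write_text(
    json.dumps({"run_name": "demo"}), encoding="utf-8"
)
config = get_main_config()
```

Deriving every directory from a single root keeps the paths consistent when the project is moved to another location.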
module so_ana_util.common_types
module so_ana_util.data_access
module so_ana_util.error_handling
module so_ana_util.so_ana_json
package so_ana_sqlalchemy_models
Contains code related to database access.