pymedextcore package¶
Submodules¶
pymedextcore.annotators module¶
- class pymedextcore.annotators.Annotation(type: str, value: str, source: str, source_ID: str, span: Optional[Tuple[int, int]] = None, attributes: Optional[Dict] = None, isEntity: bool = False, ID: Optional[str] = None, ngram: Optional[str] = None)[source]¶
- Bases: - object- Base object that holds an annotation. Each Annotator must return a list of Annotations. - add_child(child)[source]¶
- Add a child to the current Annotation - Parameters
- child – an Annotation to set as a child of the current node 
- Returns
- None 
- Return type
- None 
 
 - add_property(neighbor)[source]¶
- Add the properties of a neighbor to the current annotation, if both have the same span - Parameters
- neighbor – the neighboring Annotation whose properties are added 
- Returns
- None 
- Return type
- None 
 
 - get_attributes()[source]¶
- Get the attributes of the current node and of its parents - Returns
- attributes 
- Return type
- a dict 
 
 - get_children_span()[source]¶
- Return the spans of all children of the current node - Returns
- spans of the children 
- Return type
- list of tuple 
 
 - get_entities_children()[source]¶
- From the current node, return all children that are Annotations with isEntity = True - Returns
- children list 
- Return type
- list 
 
 - get_parent(from_type)[source]¶
- Return the closest parent of the current Annotation that has a specific type - Parameters
- from_type – the type to find 
- Returns
- Annotation of a specific type 
- Return type
 
 - get_parents_properties()[source]¶
- Return the properties of the current annotation and of its parents, for annotations that belong to specific types - Parameters
- filter_type – list of Annotation types 
- Returns
- list of the properties of the current Annotation and of its parents 
- Return type
- list of dict 
 
 - get_properties()[source]¶
- Return the properties of the current node if the Annotation is of a specific type - Parameters
- filter_type – list of Annotation types 
- Returns
- properties 
- Return type
- list of dictionaries 
 
 - set_parent(parent)[source]¶
- Set the parent of the current Annotation - Parameters
- parent – Annotation 
- Returns
- 1 
- Return type
- int 
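The parent/child methods above suggest that annotations form a tree. The following is a minimal sketch of that idea, assuming a simplified stand-in class (`Node`, with hypothetical `parent` and `children` fields); it is not the pymedextcore implementation, and the `syntagm` type is only an illustrative name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Simplified stand-in for Annotation (assumption, not the library class)."""
    type: str
    value: str
    span: Optional[Tuple[int, int]] = None
    attributes: Dict = field(default_factory=dict)
    isEntity: bool = False
    # Tree links are excluded from repr/compare to avoid infinite recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list, repr=False, compare=False)

    def set_parent(self, parent: "Node") -> None:
        # Link both directions so the tree can be walked up and down.
        self.parent = parent
        parent.children.append(self)

    def get_parent(self, from_type: str) -> Optional["Node"]:
        # Walk upward until an ancestor of the requested type is found.
        node = self.parent
        while node is not None and node.type != from_type:
            node = node.parent
        return node

    def get_attributes(self) -> Dict:
        # Merge from the root down, so the closest node wins on key conflicts.
        merged = {} if self.parent is None else self.parent.get_attributes()
        merged.update(self.attributes)
        return merged

    def get_entities_children(self) -> List["Node"]:
        return [child for child in self.children if child.isEntity]


syntagm = Node("syntagm", "oesophagite peptique de grade II", span=(188, 220),
               attributes={"context": "patient"})
grade = Node("regex", "grade II", span=(212, 220),
             attributes={"label": "Grade"}, isEntity=True)
grade.set_parent(syntagm)
```

In this sketch `grade.get_parent("syntagm")` returns the enclosing node, and `grade.get_attributes()` combines the attributes of both nodes.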
 
 
- class pymedextcore.annotators.Annotator[source]¶
- Bases: - object- Abstract base class for all Annotators. Each Annotator must implement annotate_function(), which returns a list of Annotation objects. - annotate_function(_input)[source]¶
- Main annotation function; each Annotator must implement it - Parameters
- _input – a list of Annotation types 
- Returns
- a list of annotations. They will be added to Document.annotations 
- Return type
- List[Annotation] 
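As an illustration of the contract described above (subclass, implement annotate_function(), return a list of annotations), here is a minimal sketch. `BaseAnnotator` is a stand-in for the abstract class, `GradeAnnotator` is a hypothetical annotator, and annotations are represented as plain dicts; the real input and output types in pymedextcore may differ.

```python
import re
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseAnnotator(ABC):
    """Stand-in for the Annotator abstract class (assumption, simplified)."""

    @abstractmethod
    def annotate_function(self, _input: List[Dict]) -> List[Dict]:
        ...


class GradeAnnotator(BaseAnnotator):
    """Hypothetical annotator that tags 'grade <roman numeral>' mentions."""

    pattern = re.compile(r"grade\s+[IV]+")

    def annotate_function(self, _input: List[Dict]) -> List[Dict]:
        found = []
        for ann in _input:
            offset = ann["span"][0]
            for match in self.pattern.finditer(ann["value"]):
                found.append({
                    "type": "regex",
                    "value": match.group(0),
                    # Shift the match back into document coordinates.
                    "span": (offset + match.start(), offset + match.end()),
                    "isEntity": True,
                })
        return found


segment = [{"type": "raw_text",
            "value": "oesophagite peptique de grade II",
            "span": (188, 220)}]
annotations = GradeAnnotator().annotate_function(segment)
```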
 
 - get_all_key_input(_input)[source]¶
- Return all key inputs for the Annotator - Parameters
- _input – annotations of the specific types from the Document 
- Returns
- a list of annotations 
- Return type
- list of annotations 
 - Deprecated since version 0.3: This function will be removed soon; use select_all_inputs instead. 
 - get_first_key_input(_input)[source]¶
- Return the annotations of the first input type (index 0) - Parameters
- _input – list of annotation inputs for the Annotator 
- Returns
- a list of annotations 
- Return type
- list of annotations 
 - Deprecated since version 0.3: This function will be removed soon; use select_first_input instead. 
 - get_key_input(_input, i)[source]¶
- Return the annotations of one specific type from key_input - Parameters
- _input – key_input list 
- i – the index of the list to select 
 
- Returns
- a list of annotations 
- Return type
- list of annotations 
 
pymedextcore.bioctransform module¶
- class pymedextcore.bioctransform.BioC[source]¶
- Bases: - pymedextcore.datatransform.DataTransform- static load_collection(bioc_input: str, format: int = 0, is_file: bool = True)[source]¶
- Load a BioC collection from XML or JSON. Returns a list of Document objects. - Parameters
- bioc_input – path to a BioC file, or a BioC input string 
- format – format of the BioC file (XML or JSON) 
- is_file – if True bioc_input is path else it is a string 
 
- Returns
- list of Document 
 
 - static save_as_collection(list_of_pymedext_documents: List[pymedextcore.document.Document])[source]¶
- Save a list of pymedext Documents as a BioC collection. Returns a BioC collection object. - Parameters
- list_of_pymedext_documents – a list of Document 
- Returns
- a bioc collection object 
 
 
pymedextcore.brat_parser module¶
- class pymedextcore.brat_parser.Attribute(id: str, type: str, target: str, values: Tuple[str, ...] = ())[source]¶
- Bases: - object- A simple attribute data structure. - id: str¶
 - target: str¶
 - type: str¶
 - values: Tuple[str, ...] = ()¶
 
- class pymedextcore.brat_parser.AugmentedEntity(id: str, type: str, span: Tuple[Tuple[int, int], ...], text: str, relations_from_me: Tuple[pymedextcore.brat_parser.Relation, ...], relations_to_me: Tuple[pymedextcore.brat_parser.Relation, ...], attributes: Tuple[pymedextcore.brat_parser.Attribute, ...])[source]¶
- Bases: - object- An augmented entity data structure with its relations and attributes. - attributes: Tuple[pymedextcore.brat_parser.Attribute, ...]¶
 - property end: int¶
 - id: str¶
 - relations_from_me: Tuple[pymedextcore.brat_parser.Relation, ...]¶
 - relations_to_me: Tuple[pymedextcore.brat_parser.Relation, ...]¶
 - span: Tuple[Tuple[int, int], ...]¶
 - property start: int¶
 - text: str¶
 - type: str¶
 
- class pymedextcore.brat_parser.Document(entities: List[pymedextcore.brat_parser.Entity], relations: List[pymedextcore.brat_parser.Relation], attributes: List[pymedextcore.brat_parser.Attribute])[source]¶
- Bases: - object- attributes: List[pymedextcore.brat_parser.Attribute]¶
 - entities: List[pymedextcore.brat_parser.Entity]¶
 - relations: List[pymedextcore.brat_parser.Relation]¶
 
- class pymedextcore.brat_parser.Entity(id: str, type: str, span: Tuple[Tuple[int, int], ...], text: str)[source]¶
- Bases: - object- A simple annotation data structure. - property end: int¶
 - id: str¶
 - span: Tuple[Tuple[int, int], ...]¶
 - property start: int¶
 - text: str¶
 - type: str¶
 
- class pymedextcore.brat_parser.Grouping(id: str, type: str, items: List[pymedextcore.brat_parser.Entity])[source]¶
- Bases: - object- id: str¶
 - items: List[pymedextcore.brat_parser.Entity]¶
 - property text¶
 - type: str¶
 
- class pymedextcore.brat_parser.Relation(id: str, type: str, subj: str, obj: str)[source]¶
- Bases: - object- A simple relation data structure. - id: str¶
 - obj: str¶
 - subj: str¶
 - type: str¶
 
- pymedextcore.brat_parser.get_augmented_entities(ann_path: str) → Dict[str, pymedextcore.brat_parser.AugmentedEntity][source]¶
- pymedextcore.brat_parser.get_entities_relations_attributes_groups(ann_path: str) → Tuple[Dict[str, pymedextcore.brat_parser.Entity], Dict[str, pymedextcore.brat_parser.Relation], Dict[str, pymedextcore.brat_parser.Attribute], Dict[str, pymedextcore.brat_parser.Grouping]][source]¶
- pymedextcore.brat_parser.parse(ann_path: str) → pymedextcore.brat_parser.Document[source]¶
- pymedextcore.brat_parser.parse_attribute(attribute_id: str, attribute_content: str) → pymedextcore.brat_parser.Attribute[source]¶
- Parse the annotation string into an Attribute structure. - attribute_id : str The attribute ID in the annotation (`A1` for example) 
- attribute_content : str The attribute text content (`Tense T19 Past-Ended` for example) 
 - Attribute An Attribute object 
 
- pymedextcore.brat_parser.parse_entity(tag_id: str, tag_content: str) → pymedextcore.brat_parser.Entity[source]¶
- Parse the entity string into an Entity structure. - tag_id : str The tag ID in the annotation (`T12` for example) 
- tag_content : str The tag text content (`Temporal-Modifier 116 126 history of` for example) 
 - Entity An Entity object 
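The entity format described above can be sketched as follows. This assumes the usual Brat standoff layout, where the type and offsets are separated from the covered text by a tab and discontinuous fragments are joined by `;`. `EntitySketch` and `parse_entity_sketch` are hypothetical names, not the module's own code.

```python
from typing import NamedTuple, Tuple


class EntitySketch(NamedTuple):
    """Stand-in mirroring the fields of Entity (id, type, span, text)."""
    id: str
    type: str
    span: Tuple[Tuple[int, int], ...]
    text: str


def parse_entity_sketch(tag_id: str, tag_content: str) -> EntitySketch:
    # Brat separates "<type> <offsets>" from the covered text with a tab.
    head, text = tag_content.split("\t", 1)
    entity_type, offsets = head.split(" ", 1)
    # Discontinuous entities list several "start end" fragments joined by ";".
    span = tuple(
        (int(start), int(end))
        for start, end in (fragment.split() for fragment in offsets.split(";"))
    )
    return EntitySketch(tag_id, entity_type, span, text)


entity = parse_entity_sketch("T12", "Temporal-Modifier 116 126\thistory of")
split_entity = parse_entity_sketch("T1", "Drug 0 4;10 15\tabcd efghi")
```

Keeping the span as a tuple of fragments matches the `Tuple[Tuple[int, int], ...]` annotation in the class signature and covers both contiguous and discontinuous entities.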
 
- pymedextcore.brat_parser.parse_relation(relation_id: str, relation_content: str) → pymedextcore.brat_parser.Relation[source]¶
- Parse the annotation string into a Relation structure. - relation_id : str The relation ID in the annotation (`R12` for example) 
- relation_content : str The relation text content (`Modified-By Arg1:T8 Arg2:T6` for example) 
 - Relation A Relation object 
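Relation and attribute contents follow the same whitespace-separated layout as the examples in the docstrings above. The sketch below returns plain dicts with the field names of the Relation and Attribute classes; the function names are hypothetical and the real parsers may handle more cases.

```python
from typing import Dict


def parse_relation_sketch(relation_id: str, relation_content: str) -> Dict:
    # "Modified-By Arg1:T8 Arg2:T6" -> relation type, subject ID, object ID
    relation_type, subj, obj = relation_content.split()
    return {
        "id": relation_id,
        "type": relation_type,
        "subj": subj.split(":", 1)[1],
        "obj": obj.split(":", 1)[1],
    }


def parse_attribute_sketch(attribute_id: str, attribute_content: str) -> Dict:
    # "Tense T19 Past-Ended" -> attribute type, target entity ID, optional values
    attribute_type, target, *values = attribute_content.split()
    return {
        "id": attribute_id,
        "type": attribute_type,
        "target": target,
        "values": tuple(values),
    }


relation = parse_relation_sketch("R12", "Modified-By Arg1:T8 Arg2:T6")
attribute = parse_attribute_sketch("A1", "Tense T19 Past-Ended")
```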
 
- pymedextcore.brat_parser.parse_string(annotation_string: str) → pymedextcore.brat_parser.Document[source]¶
- pymedextcore.brat_parser.parse_string_to_augmented_entities(annotation_string: str) → Dict[str, pymedextcore.brat_parser.AugmentedEntity][source]¶
- pymedextcore.brat_parser.read_file_annotations(ann: str) → Tuple[List[pymedextcore.brat_parser.Entity], List[pymedextcore.brat_parser.Relation], List[pymedextcore.brat_parser.Attribute]][source]¶
- Read an annotation file and get the Entities, Relations, and Attributes in it. - ann : str The path to the annotation file to be processed. 
 - Tuple[List[Entity], List[Relation], List[Attribute]] A tuple of lists of Entities, Relations, and Attributes. 
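Reading an annotation file amounts to routing each line to the right parser. A minimal sketch, assuming the Brat convention that the first character of the ID gives the kind of line (`T` for entities, `R` for relations, `A` for attributes); `split_annotation_lines` is a hypothetical helper and the sample content is made up for illustration.

```python
from typing import Dict, List, Tuple


def split_annotation_lines(annotation_string: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group (id, content) pairs by the leading letter of the annotation ID."""
    groups: Dict[str, List[Tuple[str, str]]] = {"T": [], "R": [], "A": []}
    for line in annotation_string.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        ann_id, content = line.split("\t", 1)
        kind = ann_id[0]
        if kind in groups:
            groups[kind].append((ann_id, content))
    return groups


sample = (
    "T6\tProblem 100 109\tarthritis\n"
    "T8\tTemporal-Modifier 116 126\thistory of\n"
    "R12\tModified-By Arg1:T8 Arg2:T6\n"
    "A1\tTense T8 Past-Ended\n"
)
groups = split_annotation_lines(sample)
```

Each group can then be handed to the matching parse function (entity, relation, or attribute).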
 
pymedextcore.brattransform module¶
Created 2020/04/14
@author: David BAUDOIN
Purpose: create or update a BRAT file from a pymedext dict
- class pymedextcore.brattransform.brat[source]¶
- Bases: - pymedextcore.datatransform.DataTransform- static load_from_brat(ann_file: str, txt_file: Optional[str] = None) → pymedextcore.document.Document[source]¶
- Load annotations from a .ann file in the Brat format - Parameters
- ann_file – path to the .ann file 
- txt_file – path to the corresponding .txt file, if None: defaults to replacing .ann by .txt 
 
- Returns
- Document 
- Return type
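The default described for txt_file (replace .ann by .txt) can be sketched as below. `resolve_txt_path` is a hypothetical helper written for illustration, not part of the class.

```python
import os
from typing import Optional


def resolve_txt_path(ann_file: str, txt_file: Optional[str] = None) -> str:
    """Return txt_file if given, else the .ann path with a .txt extension."""
    if txt_file is not None:
        return txt_file
    root, _extension = os.path.splitext(ann_file)
    return root + ".txt"


default_path = resolve_txt_path("corpus/doc1.ann")
explicit_path = resolve_txt_path("corpus/doc1.ann", "texts/doc1.txt")
```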
 
 - save_to_brat(folder_path: Optional[str] = None, pym_ann_types: Optional[List[str]] = None, brat_entities_in_pym_types: Optional[List[str]] = None, brat_entities_in_pym_types_value: Optional[List[str]] = None, brat_entities_in_pym_att_values: Optional[dict] = None, brat_entities_in_pym_att_keys: Optional[dict] = None, brat_attributes: Optional[dict] = None, pym_rel_types: Optional[List[str]] = None, brat_ents_of_rel_in_pym_rel_type: Optional[List[str]] = None, brat_ents_of_rel_in_pym_ent_value: Optional[List[str]] = None, brat_ents_of_rel_in_pym_att_values: Optional[dict] = None, brat_type_of_rel_in_pym_rel_types: Optional[List[str]] = None, brat_type_of_rel_in_pym_rel_att_values: Optional[dict] = None, level_annot: Optional[dict] = None)[source]¶
- Write all Annotations to Brat files in folder_path. For each pymedext Document in the input list, two files are created (or overwritten): - ID.ann: Brat annotation file (with ID = dic_pymedext.id) 
- ID.txt: raw text of the document (with ID = dic_pymedext.id) 
 - An annotation.conf file is also created (or overwritten). - Parameters
- list_of_documents – list of input Documents. Documents should contain the same types of annotations 
- folder_path – path as a string. Files are stored at this location. The folder must already exist. 
 - For the other parameters, the following extract of a pymedext document is used in the examples: 
 - '''
   {'type': 'QuickUMLS', 'value': 'oesophagite', 'ngram': None, 'span': (188, 199), 'source': 'QuickUMLS:v1', 'source_ID': '6814e9fa-96f7-11eb-a8c8-0242ac110002', 'isEntity': False,
    'attributes': {'hypothesis': 'certain', 'context': 'patient', 'negation': 'aff', 'cui': 'C0014868', 'label': 'oesophagite', 'semtypes': ['T047'], 'score': 1.0, 'snippet': ' La fibroscopie oeso-gastro-duodénale avait révélé une oesophagite peptique de grade II et a permis l’exérèse d’un petit papillome du tiers supérieur de l’œsophage', 'snippet_span': (132, 296)},
    'ID': '681c2d82-96f7-11eb-a8c8-0242ac110002'},
   {'type': 'regex', 'value': 'grade II', 'ngram': None, 'span': (212, 220), 'source': 'RegexMatcher:v1', 'source_ID': '68155570-96f7-11eb-a8c8-0242ac110002', 'isEntity': True,
    'attributes': {'version': 'v1', 'label': 'Grade', 'id_regexp': 'id_grade', 'snippet': '-gastro-duodénale avait révélé une oesophagite peptique de grade II et a permis l’exérèse d’un petit papillome du tiers supérie', 'hypothesis': 'certain', 'context': 'patient', 'negation': 'aff'},
    'ID': '682ca3ec-96f7-11eb-a8c8-0242ac110002'},
   '''
 - Annotations: 
- pym_ann_types – pymedext annotation types to export. Example: ['QuickUMLS', 'regex'] -> annotations in Brat will cover these two types. Depending on the options filled in (explained below), different labels are displayed in Brat. 
- brat_entities_in_pym_types – (optional) fill this list if Brat entities correspond to annotation types in pymedext. Example: ['regex'] -> in Brat, 'regex' is displayed for each regex found. With the extract above, 'grade II' is highlighted in the text with the label 'regex'. 
- brat_entities_in_pym_types_value – fill this list if Brat entities correspond to the value of annotation types in pymedext. Example: ['QuickUMLS'] -> in Brat, the QuickUMLS annotation value is displayed for each QuickUMLS found. With the extract above, 'oesophagite' is highlighted in the text with the label 'QuickUMLS'. 
- brat_entities_in_pym_att_values – (optional) fill this dict if Brat entities correspond to annotation attribute values in pymedext. Keys are pymedext annotation types, values are pymedext attribute keys. Example: {'regex': 'label'} -> in Brat, the regex label stored in the attributes is displayed for each regex found. With the extract above, 'grade II' is highlighted in the text with the label 'Grade'. 
- brat_entities_in_pym_att_keys – (optional) fill this dict if Brat entities correspond to annotation attribute keys in pymedext. Keys are pymedext annotation types, values are pymedext attribute keys. Example: {'regex': 'label'} -> in Brat, the string "label" is displayed for each regex found. With the extract above, 'grade II' is highlighted in the text with the label 'label'. 
- brat_attributes – (optional) dict with pymedext annotation types as keys and, as values, the list of attributes to export as Brat attributes. Example: {"QuickUMLS": ['hypothesis', 'negation', 'context']} -> for each QuickUMLS found, the hypothesis, negation and context attribute values are displayed. Use "all" as the value to export every attribute of that annotation type. Example: {"QuickUMLS": "all"} -> for each QuickUMLS found, all attributes (semType, CUI code, hypothesis, …) are displayed. 
 - Relations: 
- pym_rel_types – pymedext relation types to export. Example: ['Stanza'] -> relations in Brat will cover these relation types. Depending on the options filled in (explained below), different labels are displayed in Brat. 
- brat_ents_of_rel_in_pym_rel_type – (optional) fill this list if Brat entities of relations correspond to relation types in pymedext. 
- brat_ents_of_rel_in_pym_ent_value – (optional) fill this list if Brat entities of relations correspond to relation types in pymedext. 
- Returns
- 1 
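To make the label options more concrete, here is a minimal sketch of how a Brat label could be chosen for one annotation and written as a standoff entity line. It covers only three of the options (types, attribute values, attribute keys), the precedence between them is an assumption, and `brat_label` and `brat_entity_line` are hypothetical helpers rather than the method's actual code.

```python
from typing import Dict, Optional, Sequence


def brat_label(annotation: Dict,
               types: Sequence[str] = (),
               att_values: Optional[Dict[str, str]] = None,
               att_keys: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Pick the Brat label for one annotation dict (assumed precedence)."""
    att_values = att_values or {}
    att_keys = att_keys or {}
    ann_type = annotation["type"]
    if ann_type in att_values:
        # Label is the value stored under the configured attribute key.
        return annotation["attributes"][att_values[ann_type]]
    if ann_type in att_keys:
        # Label is the attribute key itself.
        return att_keys[ann_type]
    if ann_type in types:
        # Label is the pymedext annotation type.
        return ann_type
    return None


def brat_entity_line(index: int, label: str, annotation: Dict) -> str:
    # Standoff entity line: ID, tab, "label start end", tab, covered text.
    start, end = annotation["span"]
    return f"T{index}\t{label} {start} {end}\t{annotation['value']}"


regex_ann = {"type": "regex", "value": "grade II", "span": (212, 220),
             "attributes": {"label": "Grade", "id_regexp": "id_grade"}}
line = brat_entity_line(1, brat_label(regex_ann, att_values={"regex": "label"}), regex_ann)
```

With the extract from the docstring, the three options yield the labels 'regex', 'Grade' and 'label' respectively, which matches the examples given for each parameter.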
 
 
 
pymedextcore.connector module¶
- class pymedextcore.connector.APIConnector(baseurl: str, username: str, password: str)[source]¶
- Bases: - pymedextcore.connector._Router- Largely inspired by the work in https://github.com/doccano/doccano-client.git - For now, a copy of the DoccanoClient class in doccano_api_client.py - TODO: investigate alternatives to plaintext login - Args:
- baseurl (str): The baseurl of a Doccano instance. username (str): The Doccano username to use for the client session. password (str): The respective username’s password. 
- Returns:
- An authorized client instance. 
 
- class pymedextcore.connector.Connector[source]¶
- Bases: - object- TODO: make this an abstract class for other connectors 
- class pymedextcore.connector.DatabaseConnector(DB_host, DB_name, DB_port, DB_user, DB_password)[source]¶
- Bases: - object- Abstract class specialized in database connections 
- class pymedextcore.connector.PostGresConnector(DB_host, DB_name, DB_port, DB_user, DB_password)[source]¶
- Bases: - pymedextcore.connector.DatabaseConnector- Abstract Connector to a Postgres Database 
- class pymedextcore.connector.SSHConnector(scp_host, scp_user, scp_password)[source]¶
- Bases: - object- TODO: implement a connection to a server with paramiko, should also extend Connector 
- class pymedextcore.connector.SimpleAPIConnector(host)[source]¶
- Bases: - object- TODO: implement a connection to a server with paramiko, should also extend Connector @David? 
- class pymedextcore.connector.cxORacleConnector(DB_host, DB_name, DB_port, DB_user, DB_password)[source]¶
- Bases: - pymedextcore.connector.DatabaseConnector- Abstract connector to an Oracle database using cxOracle 
pymedextcore.datatransform module¶
Each class that transforms a pymedext Document into another format must inherit from DataTransform.
TODO: make functions such as save and load mandatory to ease the use of DataTransform objects
pymedextcore.doccanoannotator module¶
pymedextcore.doccanodocument module¶
- class pymedextcore.doccanodocument.DoccanoDocument[source]¶
- Bases: - object- DoccanoDocument is used to build an evaluation document that will be sent to the Doccano interface. A DoccanoDocument contains a set of specific DoccanoAnnotation objects that a user wants to evaluate. 
pymedextcore.doccanosource module¶
- class pymedextcore.doccanosource.DoccanoSource(baseurl, username, password)[source]¶
- Bases: - pymedextcore.source.Source,- pymedextcore.connector.APIConnector- Connection to DoccanoClient - This code is largely inspired by the work in https://github.com/doccano/doccano-client.git - create_label(project_id: str, label_name: str, color: str, prefix: str, suffix: str) → requests.models.Response[source]¶
- Adds a label to an existing project - Parameters
- project_id – the project id 
- label_name – the text of the label 
 - create_project(name: str, description: str, project_type: str, guidelines: str) → requests.models.Response[source]¶
- Creates a project - Parameters
- name – name of the project 
- description – description of the project 
- project_type – type of project (“SequenceLabeling”, “DocumentClassification” or “Seq2seq”) 
 - find_project_id(regex: str, date: str, time: str)[source]¶
- Finds a project id from an item (specific to the scanner-covid project). If the item is not enough to find the project id, date and time can be used. - Parameters
- regex – item of interest 
- date – date of the project 
- time – time of the project 
 
- Returns
- a project id 
 - get_annotation_detail(project_id: int, doc_id: int, annotation_id: int) → requests.models.Response[source]¶
 - get_annotation_list(project_id: int, doc_id: int) → requests.models.Response[source]¶
- Gets a list of annotations in a given project and document. - Args:
- project_id (int): A project ID to query. doc_id (int): A document ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_document_detail(project_id: int, doc_id: int) → requests.models.Response[source]¶
- Gets details of a given document. - Args:
- project_id (int): A project ID to query. doc_id (int): A document ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_document_list(project_id: int, url_parameters: dict = {}) → requests.models.Response[source]¶
- Gets a list of documents in a project. - Args:
- project_id (int): A project ID to query. url_parameters (dict): limit and offset 
 - Returns
- requests.models.Response: The request response. 
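The limit and offset entries of url_parameters suggest page-by-page retrieval. The sketch below only builds URLs and page parameters and performs no request; the endpoint path `v1/projects/<id>/docs` and both helper names are assumptions for illustration, so check them against the Doccano instance actually used.

```python
from typing import Dict, Iterator, Optional
from urllib.parse import urlencode


def build_document_list_url(baseurl: str, project_id: int,
                            url_parameters: Optional[Dict] = None) -> str:
    # Hypothetical endpoint layout; the real path depends on the Doccano version.
    path = f"{baseurl.rstrip('/')}/v1/projects/{project_id}/docs"
    if not url_parameters:
        return path
    return f"{path}?{urlencode(url_parameters)}"


def iter_pages(total: int, limit: int) -> Iterator[Dict[str, int]]:
    """Yield one {limit, offset} dict per page needed to cover `total` items."""
    for offset in range(0, total, limit):
        yield {"limit": limit, "offset": offset}


url = build_document_list_url("http://localhost:8000/", 3, {"limit": 10, "offset": 20})
pages = list(iter_pages(25, 10))
```

Using `None` as the default in the sketch, rather than a shared `{}`, avoids accidental state being carried between calls.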
 
 - get_features() → requests.models.Response[source]¶
- Gets features. - Returns
- requests.models.Response: The request response. 
 
 - get_label_detail(project_id: int, label_id: int) → requests.models.Response[source]¶
- Gets details of a specific label. - Args:
- project_id (int): A project ID to query. label_id (int): A label ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_label_id(project_id: int, label_name: str)[source]¶
- Gets the label id from the label name - Parameters
- project_id – id of the project 
- label_name – text of the label 
 
- Returns
- id of the label 
 - get_label_list(project_id: int) → requests.models.Response[source]¶
- Gets a list of labels in a given project. - Args:
- project_id (int): A project ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_me() → requests.models.Response[source]¶
- Gets this account information. - Returns
- requests.models.Response: The request response. 
 - get_project_detail(project_id: int) → requests.models.Response[source]¶
- Gets details of a specific project. - Args:
- project_id (int): A project ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_project_id(project_name: str)[source]¶
- Gets the project id from the project name - Parameters
- project_name – the project name 
- Returns
- the project id 
 - get_project_list() → requests.models.Response[source]¶
- Gets the project list. - Returns
- requests.models.Response: The request response. 
 
 - get_project_statistics(project_id: int) → requests.models.Response[source]¶
- Gets project statistics. - Args:
- project_id (int): A project ID to query. 
 - Returns
- requests.models.Response: The request response. 
 
 - get_rolemapping_detail(project_id: int, rolemapping_id: int) → requests.models.Response[source]¶
- Currently broken! 
 - get_roles() → requests.models.Response[source]¶
- Gets available Doccano user roles. - Returns
- requests.models.Response: The request response. 
 
 - get_user_id(username: str)[source]¶
- Gets the user id from the username - Parameters
- username – the username 
- Returns
- the user id 
 - get_user_list() → requests.models.Response[source]¶
- Gets user list. - Returns
- requests.models.Response: The request response. 
 
 - post_doc_upload(project_id: int, file_format: str, file_name: str, file_path: str = './') → requests.models.Response[source]¶
- Uploads a file to a Doccano project. - Args:
- project_id (int): The project id number. file_format (str): The file format, ex: plain, json, or conll. file_name (str): The name of the file. file_path (str): The parent path of the file. Defaults to ./. 
 - Returns
- requests.models.Response: The request response. 
 
 - set_rolemapping_list(project_id: str, user_id: str, role_id: str, username: str, rolename: str) → requests.models.Response[source]¶
- Sets user roles - Parameters
- project_id 
- user_id 
- role_id 
- username 
- rolename 
 
- Returns
- requests.models.Response: The request response. 
 
pymedextcore.doccanotransform module¶
- class pymedextcore.doccanotransform.Doccano[source]¶
- Bases: - pymedextcore.datatransform.DataTransform- This class defines a set of transformation methods to build a DoccanoDocument from several pymedext Document objects. A DoccanoDocument contains N DoccanoAnnotations that the user wants to evaluate in the Doccano interface. - Here the transformation methods are specific to scanner report extractions, and to DrWH negation, hypothesis and family context detections. Other transformation methods could be defined according to what the user wants to evaluate. - DoccanoEvalClass(dict_doccano, dictClasses, number_eval, path_to_doc)[source]¶
- Adds doccano annotations to DoccanoDocument object until a specified number of evaluations for both classes. - Parameters
- dict_doccano – a Doccano dict that will be filled until both classes reach the desired number of annotations 
- dictClasses – a dict of Doccano classes (ex: negatif vs non negatif) with their current number of occurrences 
- number_eval – the number of evaluations desired 
 
- Returns
- A list with the modified Doccano Object, and a dict of annotations classes, with their number 
- Return type
- list 
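The "fill until both classes reach the desired number" behaviour can be sketched as a quota loop. This is only one interpretation of the description above: `fill_until_quota` is a hypothetical function, candidates are modelled as a text-to-class dict, and the real method works on DoccanoDocument objects.

```python
from typing import Dict, List, Tuple


def fill_until_quota(candidates: Dict[str, str],
                     class_counts: Dict[str, int],
                     number_eval: int) -> list:
    """Select candidates per class until every class has number_eval items."""
    counts = dict(class_counts)  # work on a copy, leave the caller's dict intact
    selected: List[Tuple[str, str]] = []
    for text, label in candidates.items():
        if label in counts and counts[label] < number_eval:
            selected.append((text, label))
            counts[label] += 1
        if all(count >= number_eval for count in counts.values()):
            break  # every class reached its quota
    return [selected, counts]


candidates = {
    "a": "negatif",
    "b": "non negatif",
    "c": "non negatif",
    "d": "negatif",
    "e": "negatif",
}
selected, counts = fill_until_quota(candidates, {"negatif": 1, "non negatif": 0}, 2)
```

As in the documented return value, the result is a two-element list: the filled selection and the updated class counts.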
 
 - DoccanoEvalN(dict_doccano, number_annoted, number_eval, path_to_doc)[source]¶
- Adds DoccanoAnnotation objects to a DoccanoDocument until a specified number of evaluations is reached. - Parameters
- dict_doccano – a Doccano dict that will be filled until it reaches the desired number of annotations. 
- number_annoted – the number of evaluations desired. 
- number_eval – the current number of annotation. 
 
- Returns
- A list with the modified DoccanoDocument object, and the number of annotations 
- Return type
- list 
 
 - DoccanoEvalRappel(dict_doccano, path_to_doc)[source]¶
- Adds DoccanoAnnotation objects to a DoccanoDocument object, with a dict created with toDoccanoImaRappel - Parameters
- dict_doccano – a dict created with toDoccanoImaRappel method 
- path_to_doc – the path of the pymedext doc that was used to create dict_doccano 
 
- Returns
- DoccanoDocument object 
- Return type
 
 - docForDoccano()[source]¶
- Creates a DoccanoDocument object from the input dict - Returns
- DoccanoDocument object 
- Return type
- DoccanoDocument 
 - toDoccanoDrWH(type, segment)[source]¶
- Specific method for DrWH evaluation. Selects the DrWH annotations and their values, and creates a dict with the syntagm/sentence as key and the class value of that syntagm/sentence as value. - Ex : Extract of a pymedextDocument - {
- “type”: “drwh_syntagms”, “value”: ” Le patient présente un diabète de type II”, “span”: [ - 47, 91 - ], “source”: “DRWH_syntagms.v1”, “source_ID”: “74633e84-80a3-11ea-a7f6-180f76073bf2”, “isEntity”: false, “attributes”: null, “id”: “74633e89-80a3-11ea-a7f6-180f76073bf2” 
 - {
- “type”: “drwh_negation”, “value”: “non negatif”, “span”: [ - 47, 91 - ], “source”: “DRWH_negation.v1”, “source_ID”: “74633e89-80a3-11ea-a7f6-180f76073bf2”, “isEntity”: false, “attributes”: null, “id”: “74633e95-80a3-11ea-a7f6-180f76073bf2” 
 - An extract of the value returned by pymedextDocument.toDoccanoDrWH(type=dwh_negation, segment=dwh_syntagm) will be: - {…, "Le patient présente un diabète de type II": "non negatif", …} - Parameters
- type – dwh type of class (“dwh_negation” or “dwh_hypothesis” or “dwh_family”) 
- segment – syntagm or sentence (“dwh_sentence” or “dwh_syntagm”) 
 
- Returns
- a dict with syntagm or sentence as key and class as value 
- Return type
- dict 
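In the extract above, the source_ID of the negation annotation equals the id of the syntagm annotation. Assuming that link is how segments and classes are paired, a minimal sketch of the resulting dict looks like this; `pair_segments_with_classes` is a hypothetical function, and it uses the type names from the extract.

```python
from typing import Dict, List


def pair_segments_with_classes(annotations: List[Dict],
                               class_type: str,
                               segment_type: str) -> Dict[str, str]:
    """Map each segment text to the class of the annotation derived from it."""
    segments = {
        ann["id"]: ann["value"].strip()
        for ann in annotations
        if ann["type"] == segment_type
    }
    return {
        segments[ann["source_ID"]]: ann["value"]
        for ann in annotations
        if ann["type"] == class_type and ann["source_ID"] in segments
    }


annotations = [
    {"type": "drwh_syntagms",
     "value": " Le patient présente un diabète de type II",
     "span": [47, 91],
     "source_ID": "74633e84-80a3-11ea-a7f6-180f76073bf2",
     "id": "74633e89-80a3-11ea-a7f6-180f76073bf2"},
    {"type": "drwh_negation",
     "value": "non negatif",
     "span": [47, 91],
     "source_ID": "74633e89-80a3-11ea-a7f6-180f76073bf2",
     "id": "74633e95-80a3-11ea-a7f6-180f76073bf2"},
]
pairs = pair_segments_with_classes(annotations, "drwh_negation", "drwh_syntagms")
```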
 
 - toDoccanoImaPrecision(type, attribute=None)[source]¶
- Specific method for scanner extractor evaluation. Selects the extracted value of the desired item in the pymedext Document. Returns a dict with the context as key and the extracted value as value. - Ex : Extract of a pymedextDocument : ” … - {
- “type”: “motif”, “value”: “non évocateur”, “span”: [ - 2026, 2039 - ], “source”: “annotator_section_img”, “source_ID”: “eae2fd1e-8096-11ea-9260-e470b8d2ff7c”, “isEntity”: false, “attributes”: “ISCOVID”, “id”: “eae2fd1c-8096-11ea-b180-e470b8d2ff7c” 
 - “ - An extract of pymedextDocument.toDoccanoImaPrecision(type=”motif”, attribute=”ISCOVID”) return will be : - {…,<raw_text[2026,2039]> : “non évocateur”,…} - where raw_text[2026,2039] is the context of the extraction, a short extract of the report text around the extraction. - Parameters
- type – the type of the item (“rubrique” or “motif”) 
- attribute – item of interest (ex : “ISCOVID”) 
 
- Returns
- a dict with the literal context as key and the extracted value as value 
- Return type
- dict 
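The "context" key is described as a short extract of the report text around the extraction. A minimal sketch, assuming a fixed character window on each side of the span; the `window` parameter, the function name and the sample sentence are illustrative assumptions.

```python
from typing import Dict, List, Optional


def context_to_value(raw_text: str, annotations: List[Dict], item_type: str,
                     attribute: Optional[str] = None, window: int = 10) -> Dict[str, str]:
    """Map the text surrounding each matching annotation to its extracted value."""
    result = {}
    for ann in annotations:
        if ann["type"] != item_type:
            continue
        if attribute is not None and ann.get("attributes") != attribute:
            continue
        start, end = ann["span"]
        # Clamp the left bound; slicing past the end of the text is safe.
        context = raw_text[max(0, start - window):end + window]
        result[context] = ann["value"]
    return result


raw_text = "Motif : scanner thoracique non évocateur de COVID."
value = "non évocateur"
start = raw_text.index(value)
annotations = [{"type": "motif", "value": value,
                "span": [start, start + len(value)], "attributes": "ISCOVID"}]
contexts = context_to_value(raw_text, annotations, "motif", attribute="ISCOVID")
```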
 
 - toDoccanoImaRappel(dict_regexp_type, value, span)[source]¶
- Specific method for scanner extractor evaluation. Finds the absent items in a pymedext Document. - Parameters
- dict_regexp_type – a dict with the items as keys (ex: “ISCOVID”) and their types as values (“motif” or “rubrique”) 
- value – “Null” 
 
- Returns
- A dict with the report text as key and a list of absent items as value 
- Return type
- dict 
 
 - toDoccanoPb(path_to_doc, regexp)[source]¶
- Specific method for scanner extractor evaluation. A specific format to display documents that were annotated with the label “problem” in Doccano - Parameters
- path_to_doc – the path to the doc that was annotated with the problem label in Doccano 
- regexp – the item of interest 
 
- Returns
- a dict with the text report as key and the regex as value 
 
 
pymedextcore.document module¶
- class pymedextcore.document.Document(raw_text, ID=None, attributes=None, source=None, pathToconfig=None, documentDate=None)[source]¶
- Bases: - object- Document is the main class of pymedext. It is used to load files and annotate them with annotators - annotate(annotator)[source]¶
- Main function to annotate a Document - Parameters
- annotator – list of annotators 
- Returns
- runs _annotate, which adds annotations to the Document 
- Return type
 
 - static from_dict(d)[source]¶
- Create a Document from a dict (as created using to_dict) - Parameters
- d – Dict 
- Returns
- Document 
- Return type
- Document 
 - get_annotations(_type, source_id=None, target_id=None, attributes=None, value=None, span=None)[source]¶
- Returns the annotations of a specific type from the source. Results can be filtered by type, source_id, target_id, span, attributes and value. - Parameters
- _type – annotation type 
- source_id – annotation source id 
- target_id – annotation target id 
- attributes, value, span – optional filters on the corresponding annotation fields 
 - get_relations(_type=None, head_id=None, target_id=None)[source]¶
- Returns the relations of a specific type from the source. Results can be filtered by type, head_id or target_id. - Parameters
- _type – annotation type 
- head_id – annotation source id 
- target_id – annotation target id 
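The filtering described for get_annotations can be pictured as successive optional tests on each annotation. The sketch below works on plain dicts such as those produced by to_dict; `filter_annotations` is a hypothetical function and the exact matching rules of the library (for example for attributes) may differ.

```python
from typing import Dict, List, Optional, Sequence


def filter_annotations(annotations: List[Dict], _type: str,
                       value: Optional[str] = None,
                       span: Optional[Sequence[int]] = None,
                       attributes: Optional[Dict] = None) -> List[Dict]:
    """Keep annotations of one type, then apply each filter that was given."""
    result = []
    for ann in annotations:
        if ann["type"] != _type:
            continue
        if value is not None and ann["value"] != value:
            continue
        if span is not None and tuple(ann["span"]) != tuple(span):
            continue
        ann_attributes = ann.get("attributes") or {}
        if attributes is not None and any(
            ann_attributes.get(key) != expected for key, expected in attributes.items()
        ):
            continue
        result.append(ann)
    return result


annotations = [
    {"type": "QuickUMLS", "value": "oesophagite", "span": (188, 199),
     "attributes": {"negation": "aff", "cui": "C0014868"}},
    {"type": "regex", "value": "grade II", "span": (212, 220),
     "attributes": {"label": "Grade", "negation": "aff"}},
]
grades = filter_annotations(annotations, "regex", attributes={"label": "Grade"})
```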
 - load_annotations_files(pathToconfig)[source]¶
- Transform json Pymedext to Document - Parameters
- pathToconfig – list of paths to JSON files 
- Returns
- adds annotations to the Document 
- Return type
 
 - to_dict()[source]¶
- transform Document to dict PyMedExt TODO: Need to add the Document Date if available, the processing date, the annotators used - Returns
- json PyMedExt 
- Return type
- dict 
 
 
pymedextcore.ncbisource module¶
- class pymedextcore.ncbisource.PubTatorSource(host='https://www.ncbi.nlm.nih.gov/research/pubtator-api/publications/export/')[source]¶
- Bases: - pymedextcore.source.Source,- pymedextcore.connector.SimpleAPIConnector- Connection to PubTator api currently https://www.ncbi.nlm.nih.gov/research/pubtator-api/publications/export/ - get_pmids_annotations(pmid_list, Bioconcept='', returnFormat=0)[source]¶
- Return a set of pmid articles from PubTator - Parameters
- pmid_list – a list of articles pmid 
- Bioconcept – Default (leave it blank) includes all bioconcepts. Otherwise, the user can choose gene, disease, chemical, species, proteinmutation, dnamutation, snp, and cellline. 
- returnFormat – 0 returns a PyMedExt Document, 1 returns a BioC Document 
 
- Returns
- PyMedExt Document or BioC Document 
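For orientation, a request URL for the export endpoint can be assembled from the PMID list and the optional bioconcept. The sketch below makes no network call; the query layout (`<format>?pmids=...&concepts=...`), the `export_format` argument and the function name are assumptions about the PubTator API, and the PMIDs are arbitrary example values.

```python
from typing import Iterable
from urllib.parse import urlencode

PUBTATOR_EXPORT = "https://www.ncbi.nlm.nih.gov/research/pubtator-api/publications/export/"


def build_export_url(pmid_list: Iterable, bioconcept: str = "",
                     export_format: str = "biocjson",
                     host: str = PUBTATOR_EXPORT) -> str:
    """Build an export URL; a blank bioconcept requests all bioconcepts."""
    params = {"pmids": ",".join(str(pmid) for pmid in pmid_list)}
    if bioconcept:
        params["concepts"] = bioconcept
    # Keep commas readable instead of percent-encoding them.
    return f"{host}{export_format}?{urlencode(params, safe=',')}"


gene_url = build_export_url([28483577, 28483578], "gene")
all_url = build_export_url(["28483577"])
```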
 
pymedextcore.omopsource module¶
- class pymedextcore.omopsource.OmopSource(DB_host, DB_name, DB_port, DB_user, DB_password)[source]¶
- Bases: - pymedextcore.source.Source,- pymedextcore.connector.PostGresConnector- Connection to a Postgres OMOP source 
pymedextcore.omoptransform module¶
- class pymedextcore.omoptransform.omop[source]¶
- Bases: - pymedextcore.datatransform.DataTransform- buildNoteNlP(dict_note, note_id, note_nlp_id, nlp_workflow, thisTime, filterType, dataframe=False)[source]¶
 - generateNote(to_omop_note, to_date, note_event_id, note_event_field_concept_id, note_type_concept_id, note_class_concept_id, note_title, encoding_concept_id, language_concept_id, provider_id, visit_detail_id, note_source_value)[source]¶
 - generateNoteNLP(to_omop_nlp, to_date, note_nlp_id, note_id, section_concept_id, note_nlp_concept_id, note_nlp_source_concept_id, nlp_workflow, term_exist, entity)[source]¶
 - generatePerson(to_omop_person, gender_concept_id, year_of_birth, month_of_birth, day_of_birth, birth_datetime, death_datetime, race_concept_id, ethnicity_concept_id, location_id, provider_id, care_site_id, person_source_value, gender_source_value, gender_source_concept_id, race_source_value, race_source_concept_id, ethnicity_source_value, ethnicity_source_concept_id)[source]¶