This package exposes models (mostly transformers and classifiers) that apply complex transformations. They are meant to be used by members of the Dataset package and are applied directly to members of the Sample class (or to built-in objects).
Examples related to this package can be found in the model notebook.
Matching CPE records to existing vulnerabilities is handled by the Dataset class.
However, some CVEs are missed because certain vulnerable configurations are omitted in the CVEDataset class: we omit configurations that consist of two components joined with the
AND operator. For a closer description, see issue #252 on GitHub.
- class sec_certs.model.CPEClassifier(match_threshold=80, n_max_matches=10, spacy_model_to_use='en_core_web_sm')#
Class that can predict CPE matches for certificate instances. Adheres to the sklearn.base.BaseEstimator interface. The fit method is called on a list of CPEs and builds two look-up dictionaries, see description of attributes.
- fit(X, y=None)#
Just creates look-up structures from provided list of CPEs
X (List[CPE]) – List of CPEs that can be matched with predict()
y (Optional[List[str]]) – will be ignored, specified to adhere to sklearn BaseEstimator interface, defaults to None
- Return CPEClassifier:
return self to allow method chaining
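As a rough illustration of what "creating look-up structures" can look like, the sketch below builds two dictionaries from a list of CPE-like records and returns `self` from `fit()`. The class names `ToyCPE` and `ToyLookupBuilder`, the attribute names, and the chosen dictionary layout are assumptions made for this example; the actual attributes of CPEClassifier are not described in this section.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCPE:
    """Hypothetical stand-in for a CPE record; not the sec_certs CPE class."""

    uri: str
    vendor: str
    version: str


class ToyLookupBuilder:
    """Illustrative estimator-like object whose fit() only builds dictionaries."""

    def fit(self, X, y=None):
        # Assumed layout: vendor -> versions, and (vendor, version) -> CPEs.
        self.vendor_to_versions_ = defaultdict(set)
        self.vendor_version_to_cpes_ = defaultdict(set)
        for cpe in X:
            self.vendor_to_versions_[cpe.vendor].add(cpe.version)
            self.vendor_version_to_cpes_[(cpe.vendor, cpe.version)].add(cpe)
        return self  # returning self allows method chaining


# Abbreviated, made-up URIs used purely as identifiers.
cpes = [
    ToyCPE("cpe:2.3:a:acme:router:1.0", "acme", "1.0"),
    ToyCPE("cpe:2.3:a:acme:router:2.0", "acme", "2.0"),
]
builder = ToyLookupBuilder().fit(cpes)
```

Because `fit()` returns the object itself, construction and fitting can be chained in a single expression, as in the last line.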
- predict(X)#
Predicts CPE URIs for a list of tuples (vendor, product name, identified versions in product name).
X (List[Tuple[str, str, str]]) – tuples (vendor, product name, identified versions in product name)
- Return List[Optional[Set[str]]]:
List of CPE uris that correspond to given input, None if nothing was found.
- predict_single_cert(vendor, product_name, versions, relax_version=False, relax_title=False)#
Predicts CPE URIs for a triplet (vendor, product_name, list_of_versions). The prediction is made as follows:
1. Sanitize the vendor name and lemmatize the product name.
2. Find vendors in the CPE dataset that are related to the certificate.
3. Based on (vendors, versions), find all CPE items that are considered candidates for a match.
4. Compute the string similarity between each candidate CPE and the certificate name.
5. Evaluate the best string similarity; if it is above the threshold, declare it a match.
6. If no CPE item is matched, try again with the version relaxed and check CPEs that don't have their version specified.
7. (Also, search for 100% CPE matches on item name instead of title.)
vendor (Optional[str]) – manufacturer of the certificate
product_name (str) – name of the certificate
versions (Set[str]) – Set of versions that appear in the certificate name
relax_version (bool) – See step 6 above, defaults to False
relax_title (bool) – See step 7 above, defaults to False
- Return Optional[Set[str]]:
Set of matching CPE uris, None if no matches found
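The candidate-selection and thresholding steps of predict_single_cert() can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the library's implementation: `difflib.SequenceMatcher` stands in for whatever similarity scorer is actually used, lemmatization is skipped, the relaxed-version pass is run as an automatic fallback, and the function name `match_cpes` and the candidate tuple layout are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity on a 0-100 scale (difflib is a stand-in scorer)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_cpes(vendor, product_name, versions, candidates, threshold=80, n_max=10):
    """Return a set of matching URIs, or None if nothing matched.

    candidates: list of (uri, vendor, version, title) tuples (assumed layout).
    """
    vendor = vendor.strip().lower()  # minimal vendor sanitization

    def best(pool):
        # Score every candidate title against the product name, best first.
        scored = sorted(
            ((similarity(title, product_name), uri) for uri, _, _, title in pool),
            reverse=True,
        )
        return {uri for score, uri in scored[:n_max] if score >= threshold}

    # Strict pass: vendor and version must both agree.
    strict = [c for c in candidates if c[1] == vendor and c[2] in versions]
    matches = best(strict)
    if not matches:
        # Relaxed pass: consider CPEs without a specified version.
        relaxed = [c for c in candidates if c[1] == vendor and c[2] in ("", "*", "-")]
        matches = best(relaxed)
    return matches or None


# Made-up candidates; the URIs are only used as opaque identifiers.
candidates = [
    ("uri:router:2.0", "acme", "2.0", "Acme Secure Router 2.0"),
    ("uri:router:9.9", "acme", "9.9", "Acme Secure Router 9.9"),
    ("uri:firewall:any", "acme", "*", "Acme Firewall"),
]
strict_hit = match_cpes("Acme ", "Acme Secure Router 2.0", {"2.0"}, candidates)
relaxed_hit = match_cpes("acme", "Acme Firewall", {"3.0"}, candidates)
no_hit = match_cpes("other", "Acme Firewall", {"3.0"}, candidates)
```

The two-pass structure mirrors steps 5 and 6: a version-constrained search first, and a fallback to version-less CPEs only when the strict search yields nothing.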
- class sec_certs.model.SARTransformer#
Class for transforming SARs defined in the st_keywords and report_keywords dictionaries into SAR objects. This class implements the sklearn.base.TransformerMixin interface, so fit_transform() can be called on it.
- fit(certificates)#
Just returns self, no fitting needed
certificates (Iterable[CCCertificate]) – Unused parameter
- Return SARTransformer:
- transform(certificates)#
Just a wrapper around transform_single_cert() called on an iterable of CCCertificate.
certificates (Iterable[CCCertificate]) – Iterable of CCCertificate objects to perform the extraction on.
- Return List[Optional[Set[SAR]]]:
Returns List of results from transform_single_cert().
- transform_single_cert(cert)#
Given a CCCertificate, transforms SAR keywords extracted from txt files into a set of SAR objects. Also handles extraction of correct SAR levels, removal of duplicates, and filtering. Uses three sources: CSV scan, security target, and certification report. The caller should ensure that the certificates have their keywords extracted.
cert (CCCertificate) – Certificate to extract SARs from
- Return Optional[Set[SAR]]:
Set of SARs, None if none were identified.
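To make the "levels, duplicates and filtering" part more concrete, here is a minimal sketch that parses SAR keywords such as `ALC_FLR.2` from several sources and merges them into one set. The names `ToySAR`, `parse_sar` and `merge_sars` are hypothetical, and the merge rule (keep the highest level seen per family, treat a missing level as 0) is an assumption chosen for illustration; the library may prioritize its sources differently.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Matches e.g. "ALC_FLR.2" or "ADV_FSP" (level is optional).
SAR_PATTERN = re.compile(r"^([A-Z]{3})_([A-Z]{3})(?:\.(\d+))?$")


@dataclass(frozen=True)
class ToySAR:
    """Hypothetical stand-in for a SAR object."""

    family: str  # e.g. "ALC_FLR"
    level: int  # 0 when the keyword carries no level (assumption)


def parse_sar(keyword: str) -> Optional[ToySAR]:
    """Parse one keyword; return None for strings that are not SARs."""
    match = SAR_PATTERN.match(keyword.strip())
    if match is None:
        return None
    return ToySAR(f"{match.group(1)}_{match.group(2)}", int(match.group(3) or 0))


def merge_sars(*sources: Iterable[str]) -> Optional[Set[ToySAR]]:
    """Merge keywords from all sources, keeping the highest level per family."""
    best = {}
    for source in sources:
        for keyword in source:
            sar = parse_sar(keyword)
            if sar is None:
                continue  # filtering: drop anything that is not a SAR
            current = best.get(sar.family)
            if current is None or sar.level > current.level:
                best[sar.family] = sar  # de-duplicate by family
    return set(best.values()) or None


csv_scan = ["ALC_FLR.2", "junk"]
security_target = ["ALC_FLR.1", "ADV_FSP.3"]
report = []
merged = merge_sars(csv_scan, security_target, report)
nothing = merge_sars([], ["nope"])
```

Returning `None` instead of an empty set follows the return convention documented above for transform_single_cert().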
- class sec_certs.model.ReferenceFinder#
The class assigns references of other certificate instances for each instance. Adheres to sklearn BaseEstimator interface. The fit is called on a dictionary of certificates, builds a hashmap of references, and assigns references for each certificate in the dictionary.
- property duplicates#
Get the duplicates in the fitted dataset.
- Return IDMapping:
Mapping of certificate ID to digests that share it.
- fit(certificates, id_func, ref_lookup_func)#
Builds a list of references and assigns references for each certificate instance.
certificates (Certificates) – dictionary of certificates with hashes as key
id_func (IDLookupFunc) – lookup function for cert id
ref_lookup_func (ReferenceLookupFunc) – lookup for references
- predict(dgst_list, keep_unknowns=True)#
Get the references for a list of certificate digests.
dgst_list – List of certificate digests.
keep_unknowns – Whether to keep references to and from unknown certificate IDs
- Return Dict[str, References]:
Dict with certificate hash and References object.
- predict_single_cert(dgst, keep_unknowns=True)#
Get the references object for specified certificate digest.
dgst – certificate digest
keep_unknowns – Whether to keep references to unknown certificate IDs
- Return References:
- property unknown_references#
Get the unknown references in the fitted dataset (to unknown certificate IDs, not in the dataset during fit).
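The relationship between fit(), duplicates, unknown_references and keep_unknowns can be sketched as follows. `ToyReferenceFinder` is a hypothetical, heavily simplified stand-in: it tracks only outgoing direct references as plain sets of certificate IDs, whereas the References object returned by the real class is not described in this section. The dictionary-based certificates and the two lambda look-up functions are likewise assumptions for the example.

```python
from collections import defaultdict


class ToyReferenceFinder:
    """Simplified illustration; stores outgoing references per digest."""

    def fit(self, certificates, id_func, ref_lookup_func):
        # Map each certificate ID to all digests that claim it.
        self.id_to_dgsts_ = defaultdict(set)
        for dgst, cert in certificates.items():
            self.id_to_dgsts_[id_func(cert)].add(dgst)
        # Map each digest to the set of certificate IDs it references.
        self.referencing_ = {
            dgst: set(ref_lookup_func(cert)) for dgst, cert in certificates.items()
        }
        return self

    @property
    def duplicates(self):
        """Certificate IDs shared by more than one digest."""
        return {cid: d for cid, d in self.id_to_dgsts_.items() if len(d) > 1}

    @property
    def unknown_references(self):
        """Per digest, referenced IDs that were not in the dataset during fit."""
        result = {}
        for dgst, refs in self.referencing_.items():
            unknown = {r for r in refs if r not in self.id_to_dgsts_}
            if unknown:
                result[dgst] = unknown
        return result

    def predict_single_cert(self, dgst, keep_unknowns=True):
        refs = self.referencing_[dgst]
        if keep_unknowns:
            return set(refs)
        return {r for r in refs if r in self.id_to_dgsts_}


certs = {
    "d1": {"id": "A", "refs": ["B", "X"]},
    "d2": {"id": "B", "refs": []},
    "d3": {"id": "B", "refs": ["A"]},
}
finder = ToyReferenceFinder().fit(
    certs,
    id_func=lambda c: c["id"],
    ref_lookup_func=lambda c: c["refs"],
)
```

In this toy dataset, ID `B` is shared by two digests and `X` is referenced but never defined, which is exactly the kind of situation the two properties are meant to expose.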
- class sec_certs.model.TransitiveVulnerabilityFinder(id_func)#
The class assigns vulnerabilities to each certificate instance caused by references among certificate instances. Adheres to sklearn BaseEstimator interface.
- fit(certificates, ref_func)#
Assigns to each certificate the vulnerabilities caused by references among certificates.
certificates (Certificates) – Dictionary of certificates with digests
- Return Vulnerabilities:
Dictionary of vulnerabilities of certificate instances
- predict(dgst_list)#
Method returns vulnerabilities for a list of certificate digests
dgst_list (List[str]) – list of certificate digests
- Return Dict[str, TransitiveCVE]:
Dictionary of TransitiveCVE objects for specified certificate digests
- predict_single_cert(dgst)#
Method returns vulnerabilities for certificate digest
dgst (str) – Digest of certificate
- Return TransitiveCVE:
TransitiveCVE object of certificate
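One way to picture vulnerabilities "caused by references" is as a graph traversal: a certificate picks up the CVEs of the certificates it references, directly or through a chain. The sketch below assumes that propagation direction, represents the result as a plain dictionary rather than a TransitiveCVE object, and uses placeholder CVE identifiers; the function name `transitive_cves` is hypothetical.

```python
from collections import deque


def transitive_cves(dgst, references, own_cves):
    """Collect CVEs reachable from `dgst` through references.

    references: digest -> set of referenced digests (assumed layout)
    own_cves:   digest -> set of CVE identifiers assigned to that certificate
    """
    # CVEs of certificates referenced directly.
    direct = set()
    for ref in references.get(dgst, ()):
        direct |= own_cves.get(ref, set())

    # Breadth-first walk over the whole reference chain; `seen` guards
    # against cycles and keeps the starting certificate out of the result.
    seen = {dgst}
    queue = deque(references.get(dgst, ()))
    reachable = set()
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        reachable |= own_cves.get(current, set())
        queue.extend(references.get(current, ()))
    return {"direct": direct, "indirect": reachable}


# Placeholder data with a reference cycle a -> b -> c -> a.
references = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
own_cves = {
    "a": {"CVE-0000-0003"},
    "b": {"CVE-0000-0001"},
    "c": {"CVE-0000-0002"},
}
result = transitive_cves("a", references, own_cves)
```

Tracking visited digests matters because certificate references can form cycles, and an unguarded traversal would not terminate.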