3. API Documentation

3.1. Main Module

3.2. Application

class mecoshark.mecosharkapp.MecoSHARK(input_path, output, project_name, revision, url, makefile_contents, db_name, db_host, db_port, db_user, db_password, db_authentication, debug_level, ssl_enabled)[source]

Main app for the mecoshark plugin

__init__(input_path, output, project_name, revision, url, makefile_contents, db_name, db_host, db_port, db_user, db_password, db_authentication, debug_level, ssl_enabled)[source]

Main runner of the mecoshark app

Parameters
  • input_path – path to the revision that is used as input

  • output – path to an output directory, where files can be stored

  • project_name – name of the project

  • revision – string of the revision hash

  • url – url of the project that is analyzed

  • makefile_contents – contents of the makefile (e.g., for the c processor)

  • db_name – name of the database

  • db_host – name of the host where the mongodb is running

  • db_port – port on which the mongodb listens

  • db_user – username of the mongodb user

  • db_password – password for the mongodb user

  • db_authentication – name of the database that is used as authentication

  • debug_level – debug level, as defined in the logging module

Warning

URL must be the same as the url that was stored in the mongodb by vcsSHARK!
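
The db_* parameters above describe a MongoDB connection. As a rough illustration of how they relate to each other, the sketch below assembles them into a standard MongoDB connection URI. The helper build_mongodb_uri and the example values are hypothetical, and MecoSHARK itself may establish its connection in a different way.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(db_name, db_host, db_port, db_user=None,
                      db_password=None, db_authentication=None,
                      ssl_enabled=False):
    """Build a MongoDB connection URI (hypothetical helper, not part of mecoshark)."""
    credentials = ""
    if db_user and db_password:
        # Percent-encode credentials so characters like '@' or ':' are safe.
        credentials = f"{quote_plus(db_user)}:{quote_plus(db_password)}@"

    options = []
    if db_authentication:
        # authSource names the database that holds the user's credentials.
        options.append(f"authSource={quote_plus(db_authentication)}")
    if ssl_enabled:
        options.append("ssl=true")

    uri = f"mongodb://{credentials}{db_host}:{db_port}/{db_name}"
    if options:
        uri += "?" + "&".join(options)
    return uri


# Example values are made up for illustration.
uri = build_mongodb_uri("smartshark", "localhost", 27017,
                        db_user="shark", db_password="p@ss",
                        db_authentication="admin")
print(uri)
```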

__weakref__

list of weak references to the object (if defined)

detect_languages()[source]

Detects programming languages used in the input path

process_revision()[source]

Processes a revision. First, the languages used by the project are detected. Then the processors that support these languages are determined, and their process method is called.
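
The flow described here (detect languages, then pick matching processors) can be sketched as follows. This is a simplified stand-in, not the MecoSHARK implementation: it counts file extensions instead of using sloccount, and the extension mapping, function names, and threshold values are assumptions made for the example.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to language name.
EXTENSION_TO_LANGUAGE = {".java": "java", ".c": "c", ".h": "c", ".py": "python"}


def language_shares(file_paths):
    """Return the share of all files that belongs to each detected language."""
    counts = Counter()
    for path in file_paths:
        language = EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix)
        if language is not None:
            counts[language] += 1
    total = len(file_paths)
    return {lang: n / total for lang, n in counts.items()} if total else {}


def select_languages(shares, thresholds):
    """Return the languages whose share exceeds the threshold of their processor."""
    return sorted(lang for lang, share in shares.items()
                  if lang in thresholds and share > thresholds[lang])


files = ["src/A.java", "src/B.java", "src/C.java", "tools/build.py"]
shares = language_shares(files)
selected = select_languages(shares, {"java": 0.4, "python": 0.4})
print(shares)    # {'java': 0.75, 'python': 0.25}
print(selected)  # ['java']
```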

static sanitize_sloccount_output(output)[source]

Method that sanitizes the sloccount output (because we read it directly from the command line)

Parameters

output – output that must be sanitized

3.3. Processor

3.3.1. Base Processor

class mecoshark.processor.baseprocessor.BaseProcessor(output_path, input_path)[source]

Abstract base class for all mecoshark processors

Parameters
  • output_path – path to an output directory, where files can be stored

  • input_path – path to the revision that is used as input

Property input_path

path to the revision that is used as input

Property output_path

path to an output directory, where files can be stored

Property projectname

name of the project (last part of input path)

__init__(output_path, input_path)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

abstract property enabled

Trigger to enable/disable plugins

Returns

boolean

prepare_template(template)[source]

Copies the template from the template folder to the output_path and sets access rights.

Several variables (marked with $<name>) are substituted, so that the template can be used right away

Parameters

template – path to the template
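
A minimal sketch of this kind of template preparation, assuming the placeholders follow the $name syntax of Python's string.Template. The function signature, the variable names, and the chosen access rights are assumptions for illustration and are not taken from the MecoSHARK source.

```python
import os
import stat
import tempfile
from string import Template


def prepare_template(template_path, output_path, variables):
    """Copy a template into output_path, fill in $name placeholders, set access rights."""
    with open(template_path, encoding="utf-8") as handle:
        content = Template(handle.read())

    target = os.path.join(output_path, os.path.basename(template_path))
    with open(target, "w", encoding="utf-8") as handle:
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        handle.write(content.safe_substitute(variables))

    # Assumed access rights: the owner may read, write and execute the file.
    os.chmod(target, stat.S_IRWXU)
    return target


with tempfile.TemporaryDirectory() as workdir:
    template_dir = os.path.join(workdir, "templates")
    out_dir = os.path.join(workdir, "out")
    os.makedirs(template_dir)
    os.makedirs(out_dir)

    # Hypothetical template content.
    template_file = os.path.join(template_dir, "analyze.sh")
    with open(template_file, "w", encoding="utf-8") as handle:
        handle.write("#!/bin/sh\nanalyze --input $input --results $results\n")

    prepared = prepare_template(template_file, out_dir,
                                {"input": "/data/rev", "results": "/data/out"})
    with open(prepared, encoding="utf-8") as handle:
        prepared_content = handle.read()

print(prepared_content)
```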

Returns

abstract process(project_name, revision, url, options, debug_level)[source]

Is called if a revision with hash “revision” should be processed.

Parameters
  • project_name – name of the project

  • revision – revision_hash of the revision

  • url – url of the project that is analyzed

  • options – possible options (e.g. for CProcessor)

  • debug_level – debugging_level

Returns

abstract property supported_languages

Currently: java, c, cs, cpp, python

abstract property threshold

Threshold on which the processor should be executed

Example

If threshold is 0.4, then the processor is executed if more than 40% of all files have the supported file type.

Returns

threshold
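
To illustrate how the abstract members of a base processor fit together, here is a self-contained sketch. The classes Processor and DemoJavaProcessor are stand-ins written for this example (they are not the real BaseProcessor or JavaProcessor), and should_run is a hypothetical helper that applies the threshold rule per language.

```python
from abc import ABC, abstractmethod


class Processor(ABC):
    """Stand-in base class with the abstract members listed above."""

    def __init__(self, output_path, input_path):
        self.output_path = output_path
        self.input_path = input_path

    @property
    @abstractmethod
    def enabled(self):
        """Whether the processor may be used at all."""

    @property
    @abstractmethod
    def supported_languages(self):
        """List of language names the processor can handle."""

    @property
    @abstractmethod
    def threshold(self):
        """Minimum share of files that must be exceeded."""

    @abstractmethod
    def process(self, project_name, revision, url, options, debug_level):
        """Process one revision."""

    def should_run(self, shares):
        # Hypothetical helper: run if any supported language exceeds the threshold.
        return self.enabled and any(
            shares.get(language, 0.0) > self.threshold
            for language in self.supported_languages
        )


class DemoJavaProcessor(Processor):
    @property
    def enabled(self):
        return True

    @property
    def supported_languages(self):
        return ["java"]

    @property
    def threshold(self):
        return 0.4

    def process(self, project_name, revision, url, options, debug_level):
        return f"processed {project_name}@{revision}"


processor = DemoJavaProcessor("/tmp/out", "/tmp/in")
print(processor.should_run({"java": 0.75}))  # True
print(processor.should_run({"java": 0.40}))  # False, the share must exceed 0.4
```

Because every abstract member is declared with abstractmethod, instantiating Processor directly raises a TypeError, which forces each concrete processor to define all four members.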

3.3.2. C Processor

class mecoshark.processor.cprocessor.CProcessor(output_path, input_path)[source]

Implements BaseProcessor for C-like languages

__init__(output_path, input_path)[source]

Initialize self. See help(type(self)) for accurate signature.

property enabled

See: enabled()

execute_sourcemeter(makefile_contents=None)[source]

Executes sourcemeter with the given makefile_contents

Parameters

makefile_contents – makefile_contents for execution

is_output_produced()[source]

Checks if output was produced for the process

Returns

boolean

process(revision, url, makefile_contents, debug_level)[source]

See: process()

Processes the given revision in three steps: 1) executes sourcemeter, 2) creates a SourcemeterParser instance, 3) calls store_data().

Parameters
  • revision – revision

  • url – url of the project that is analyzed

  • makefile_contents – makefile_contents for execution

  • debug_level – debugging_level

property supported_languages

See: supported_languages()

property threshold

See: threshold()

3.3.3. Java Processor

class mecoshark.processor.javaprocessor.JavaProcessor(output_path, input_path)[source]

Implements BaseProcessor for Java

__init__(output_path, input_path)[source]

Initialize self. See help(type(self)) for accurate signature.

property enabled

See: enabled()

execute_sourcemeter()[source]

Executes sourcemeter for the java language. Currently, we just do a directory-based analysis.

is_output_produced()[source]

Checks if output was produced for the process

Returns

boolean

process(project_name, revision, url, options, debug_level)[source]

See: process()

Processes the given revision in three steps: 1) executes sourcemeter, 2) creates a SourcemeterParser instance, 3) calls store_data().

Parameters
  • project_name – name of the project

  • revision – revision

  • url – url of the project that is analyzed

  • options – options for execution

  • debug_level – debugging_level

property supported_languages

See: supported_languages()

property threshold

See: threshold()

3.3.4. Python Processor

class mecoshark.processor.pythonprocessor.PythonProcessor(output_path, input_path)[source]

Implements BaseProcessor for Python

__init__(output_path, input_path)[source]

Initialize self. See help(type(self)) for accurate signature.

property enabled

See: enabled()

execute_sourcemeter()[source]

Executes sourcemeter for a python project

is_output_produced()[source]

Checks if output was produced for the process

Returns

boolean

process(project_name, revision, url, options, debug_level)[source]

See: process()

Processes the given revision in three steps: 1) executes sourcemeter, 2) creates a SourcemeterParser instance, 3) calls store_data().

Parameters
  • project_name – name of the project

  • revision – revision

  • url – url of the project that is analyzed

  • options – options for execution

  • debug_level – debugging_level

property supported_languages

See: supported_languages()

property threshold

See: threshold()

3.4. Resultparser

3.4.1. SourceMeterParser

class mecoshark.resultparser.sourcemeterparser.SourcemeterParser(output_path, input_path, project_name, url, revision_hash, debug_level)[source]

Parser that parses the results from sourcemeter

Property output_path

path to an output directory, where files can be stored

Property input_path

path to the revision that is used as input

Property url

url to the repository of the project that is analyzed

Property vcs_system_id

id of the vcs_system with the given url

Property stored_files

list of files that are stored at the input path

Property ordered_file_states

dictionary that holds all results in an ordered manner (a state that has another state as parent must come after that parent state)

Property stored_file_states

states that were stored in the mongodb

Property stored_meta_package_states

meta package states that were stored in the mongodb

Property input_files

list of input files

Property commit_id

id of the commit for which the data should be stored (bson.objectid.ObjectId)

__init__(output_path, input_path, project_name, url, revision_hash, debug_level)[source]

Initialization

Parameters
  • output_path – path to an output directory, where files were stored

  • input_path – path to the revision that is used as input

  • project_name – name of the project

  • url – url to the repository of the project that is analyzed

  • revision_hash – hash of the revision, which is analyzed

  • debug_level – debug level, as defined in the logging module

__weakref__

list of weak references to the object (if defined)

find_stored_files()[source]

We need to find all files that are stored in the input path. This is needed to link the files that were parsed with the files that are already stored via vcsSHARK.

Returns

dictionary with file path as key and id as value (from vcsshark results)

get_commit_id(vcs_system_id)[source]

Gets the commit id for the corresponding projectid and revision

Parameters

vcs_system_id – id of the vcs system (bson.objectid.ObjectId)

Returns

commit_id (bson.objectid.ObjectId)

get_component_ids(row_component_ids)[source]

Function that gets the component ids from the component ids string.

Parameters

row_component_ids – component ids string

Returns

ObjectIds of all components as list (bson.objectid.ObjectId)

static get_csv_file(path)[source]

Returns a filepath, or None if nothing is found.

Parameters

path – path to file (regex)

Returns

filepath or None
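
A small sketch of a "first match or None" lookup. It assumes the pattern is a shell-style glob handled by Python's glob module; the parameter description above says "regex", so the real matching rules may differ. The function name find_first_match and the file names are made up for the example.

```python
import glob
import os
import tempfile


def find_first_match(pattern):
    """Return the first path matching the glob pattern, or None if there is none."""
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


with tempfile.TemporaryDirectory() as workdir:
    # Create one hypothetical result file to search for.
    open(os.path.join(workdir, "demo-File.csv"), "w").close()

    found = find_first_match(os.path.join(workdir, "*-File.csv"))
    missing = find_first_match(os.path.join(workdir, "*-Class.csv"))
    found_name = os.path.basename(found)

print(found_name)  # demo-File.csv
print(missing)     # None
```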

get_fullpath(long_name)[source]

If the long_name is in the input files of the input path, it will return the corresponding file name

Parameters

long_name – long_name of the row

Returns

new long_name

get_vcs_system_id()[source]

Gets the id of the vcs system for the url of the analyzed project (see the url property).

Returns

vcs_system_id (bson.objectid.ObjectId)

prepare_csv_files()[source]

Prepares the csv files generated by SourceMeter by creating a sort key and sorting the files by it

sanitize_long_name(orig_long_name)[source]

Sanitizes the long_name of the row:

  1) If the long_name contains the input path, the input path is stripped.

  2) If the long_name contains the output path, the output path is stripped.

  3) Otherwise, the long_name is split at “/” and joined back together without the first part.

Parameters

orig_long_name – long_name of the row

Returns

sanitized long_name

Note

This is necessary, as the output of sourcemeter can be different based on which processor is used.
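
The three rules can be sketched as a plain function. This is one reading of the description, not the actual method: the paths are passed in explicitly instead of being read from the parser object, and the third rule is assumed to drop the first path component.

```python
def sanitize_long_name(orig_long_name, input_path, output_path):
    """Strip a known path prefix, or drop the first path component as a fallback."""
    if input_path in orig_long_name:
        # Rule 1: remove the input path (and the separator that follows it).
        return orig_long_name.replace(input_path + "/", "", 1)
    if output_path in orig_long_name:
        # Rule 2: remove the output path (and the separator that follows it).
        return orig_long_name.replace(output_path + "/", "", 1)
    # Rule 3 (assumed reading): drop the first component of the path.
    parts = orig_long_name.split("/")
    return "/".join(parts[1:]) if len(parts) > 1 else orig_long_name


# Example paths are hypothetical.
print(sanitize_long_name("/data/rev/src/A.java", "/data/rev", "/data/out"))  # src/A.java
print(sanitize_long_name("/data/out/src/A.java", "/data/rev", "/data/out"))  # src/A.java
print(sanitize_long_name("project/src/A.java", "/data/rev", "/data/out"))    # src/A.java
```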

static sanitize_metrics_dictionary(metrics)[source]

Helper function, which sanitizes the csv reader row so that it only contains the metrics of it.

Parameters

metrics – csv reader row

Returns

dictionary of metrics

static sort_for_parent(state_dict)[source]

Sorts the given dictionary so that every parent state comes before the states that reference it. Special rules apply for file states, as they do not have any parents.

Parameters

state_dict – dictionary of states that should be ordered

Returns

ordered dictionary

Note

Example: X has parent Y, Y has parent Z. Therefore, it would be ordered: Z -> Y -> X
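
A parent-before-child ordering like the one in the example can be obtained with a depth-first traversal. The sketch below assumes a simplified state layout in which each entry is a dictionary with a "parent" key; the real state objects and the special handling of file states are not modeled here.

```python
def sort_for_parent(state_dict):
    """Return a dict ordered so that each parent precedes its children."""
    ordered = {}

    def visit(name, trail=()):
        if name in ordered or name not in state_dict:
            return
        if name in trail:
            raise ValueError(f"cyclic parent relation at {name!r}")
        parent = state_dict[name].get("parent")
        if parent is not None:
            # Place the parent first, then the state itself.
            visit(parent, trail + (name,))
        ordered[name] = state_dict[name]

    for name in state_dict:
        visit(name)
    return ordered


# Hypothetical states: X has parent Y, Y has parent Z.
states = {
    "X": {"parent": "Y"},
    "Y": {"parent": "Z"},
    "Z": {"parent": None},
}
ordered_states = sort_for_parent(states)
print(list(ordered_states))  # ['Z', 'Y', 'X']
```

Plain dictionaries keep insertion order in Python 3.7 and later, so no separate ordered container is needed in this sketch.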

store_clone_data()[source]

Parses and stores the cloning data that was generated by sourcemeter.

store_data()[source]

Call to store data: if a row contains ‘Path’, file states data is stored; otherwise, meta package data is stored.
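
The dispatch on the ‘Path’ column can be sketched with csv.DictReader. The CSV contents, the other column names, and the function classify_rows are invented for this example; the real method stores the rows in the mongodb instead of returning labels.

```python
import csv
import io

# Hypothetical CSV contents: only the first one has a 'Path' column.
FILE_CSV = "Name,Path,LOC\nA.java,src/A.java,120\n"
PACKAGE_CSV = "Name,LOC\ncom.example,480\n"


def classify_rows(csv_text):
    """Label each row as file state data or meta package data."""
    kinds = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # DictReader uses the header line as keys, so 'Path' is a key
        # exactly when the CSV file has a 'Path' column.
        kind = "file_state" if "Path" in row else "meta_package"
        kinds.append((kind, row["Name"]))
    return kinds


print(classify_rows(FILE_CSV))     # [('file_state', 'A.java')]
print(classify_rows(PACKAGE_CSV))  # [('meta_package', 'com.example')]
```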

Returns

store_extra_data()[source]

Call to store extra data. For Java this would be the PMD file, for C/C++ the cppcheck file, and for Python the pylint file.

store_file_states_data(row)[source]

Stores the file states data. Fills the stored_file_states property for less database communication.

Parameters

row – row that is processed

Note

File states have a direct connection to a file from a revision.

store_meta_package_data(row)[source]

Stores the meta package data. Fills the stored_meta_package_states property for less database communication.

Parameters

row – row that is processed

Note

Meta packages do not have a direct connection to files from a revision. A meta package consists of a set of states.