def cppcheck_ast(sourcefile):
    subprocess.call([CPPCHECK, '--dump', '--max-configs=1', sourcefile])
    data = cppcheckdata.parsedump(sourcefile + '.dump')
    cfg = data.configurations[0]
    ret = []
    for func in cfg.functions:
        name = func.name
        if func.type == 'Destructor':
            name = '~' + name
        s = '<Function'
        s += ' name="' + name + '"'
        s += ' filename="' + func.tokenDef.file + '"'
        s += ' line="' + str(func.tokenDef.linenr) + '"'
        s += '/>'
        ret.append(s)
    for scope in cfg.scopes:
        if scope.type != 'Function':
            continue
        argStart = scope.bodyStart
        while argStart and argStart.str != '(':
            argStart = argStart.previous
        s = '<Function'
        s += ' name="' + scope.className + '"'
        s += ' filename="' + argStart.file + '"'
        s += ' line="' + str(argStart.linenr) + '"'
        s += '/>'
        if s not in ret:
            ret.append(s)
    ret.sort()
    return ret
def main():
    print(sys.argv)
    function_calls = defaultdict(set)
    files_deps = defaultdict(set)
    func_decl_file = {}  # type: Dict[str, str]
    for dirpath, _, files in walk('.'):
        for basename in files:
            file = os.path.join(dirpath, basename)
            if file[-2:] == '.c':
                print(file)
                call([CPP_CHECK, '--dump', file] + sys.argv[1:])
                dump_file = file + '.dump'
                d = parsedump(dump_file)
                os.remove(dump_file)
                try:
                    scopes = d.configurations[0].scopes
                    tokens = d.configurations[0].tokenlist  # type: List[Token]
                    for scope in scopes:
                        scope.tokens = [t for t in tokens if t.scopeId == scope.Id]
                    for scope in scopes:
                        if scope.type == 'Function':
                            func_decl_file[scope.function.name] = file
                            for t in scope.tokens:
                                if t.function is not None:
                                    function_calls[scope.function.name].add(t.function.name)
                except Exception as e:
                    print(e)
    # Dependencies between .c files
    for caller, callees in function_calls.items():
        for callee in callees:
            if caller in func_decl_file and callee in func_decl_file:
                files_deps[func_decl_file[caller]].add(func_decl_file[callee])
    pprint(function_calls, indent=2)
    pprint(func_decl_file, indent=2)
    pprint(files_deps, indent=2)
    dot_graph('calls', function_calls)
    dot_graph('files', files_deps)
    dot_graph('full', function_calls, reverse_dict(func_decl_file))
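The helpers `dot_graph` and `reverse_dict` called at the end of `main()` are not shown above. A minimal sketch of what they might look like, with signatures inferred from the call sites (the exact originals may differ):

```python
from collections import defaultdict


def reverse_dict(d):
    # Invert a mapping of key -> value into value -> set of keys,
    # e.g. function -> file becomes file -> set of functions.
    rev = defaultdict(set)
    for key, value in d.items():
        rev[value].add(key)
    return rev


def dot_graph(name, edges, clusters=None):
    # Write a Graphviz .dot file from a dict of node -> set of successors.
    # 'clusters' (optional) maps a cluster label -> set of member nodes,
    # matching the third argument in the dot_graph('full', ...) call.
    lines = ['digraph "%s" {' % name]
    if clusters:
        for label, members in clusters.items():
            lines.append('  subgraph "cluster_%s" {' % label)
            for m in sorted(members):
                lines.append('    "%s";' % m)
            lines.append('  }')
    for src, dsts in edges.items():
        for dst in sorted(dsts):
            lines.append('  "%s" -> "%s";' % (src, dst))
    lines.append('}')
    with open(name + '.dot', 'w') as f:
        f.write('\n'.join(lines))
```

The resulting `.dot` files can then be rendered with `dot -Tpng calls.dot -o calls.png`.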
def main_run_collect(self, dump_file, source_file=''):
    '''
    input: a cppcheck 'dump' file containing an Abstract Syntax Tree (AST),
        symbol table, and token list.
    returns: None
    side-effects: updates database with information about this unit analysis
    '''
    self.source_file = source_file
    self.current_file_under_analysis = dump_file
    # PARSE INPUT
    data = cppcheckdata.parsedump(dump_file)
    analysis_unit_dict = {}
    # GIVE TREE WALKER ACCESS TO SOURCE FILE FOR DEBUG PRINT
    if self.source_file and os.path.exists(self.source_file):
        with open(self.source_file) as f:
            self.source_file_lines = f.readlines()
    else:
        print("no %s %s" % (self.source_file, self.debug))
    # todo: what is a data configuration? -- check for multiple
    for c in data.configurations[:1]:  # MODIFIED TO ONLY TEST THE FIRST CONFIGURATION
        # ADD AST DECORATION PLACEHOLDERS
        c = self.init_cppcheck_config_data_structures(c)
        c = self.init_cppcheck_config_functions(c)
        # REFRESH VARIABLES
        self.function_graph = nx.DiGraph()
        # GET DICT OF ALL GLOBALLY SCOPED FUNCTIONS
        analysis_unit_dict = self.find_functions(c)
        # WILL BECOME AN ORDERED DICT IF self.should_sort_by_function_graph
        sorted_analysis_unit_dict = analysis_unit_dict
        # FIND ORDER FOR FUNCTION GRAPH EXPLORATION (topo sort, if possible)
        if self.should_sort_by_function_graph:
            self.build_function_graph(analysis_unit_dict)  # WILL USE DAG SUBGRAPH
            sorted_analysis_unit_dict = \
                self.make_sorted_analysis_unit_dict_from_function_graph(
                    analysis_unit_dict)  # RETURNS ORDERED DICT
            self.all_sorted_analysis_unit_dicts.append(sorted_analysis_unit_dict)
        # COLLECT ALL TOKEN PARSE TREES FOR EACH FUNCTION
        for function_dict in sorted_analysis_unit_dict.values():
            self.collect_constraints(function_dict)
        if self.SHOULD_PRINT_CONSTRAINTS:
            self.print_all_computed_unit_constraints()
            self.print_all_df_constraints()
            self.print_all_conversion_factor_constraints()
            self.print_all_known_symbol_constraints()
            self.print_all_naming_constraints()
        self.configurations.append(c)
def process_cppcheck_dump_file(file):
    """
    Process the dump file data.

    This processes the variables and scopes configurations from
    cppcheckdata.parsedump.
    The variables configuration contains only variables.
    The scopes configuration contains the following types:
        type="Namespace", type="Function", type="Class",
        type="Struct", type="Enum"
    All scope configuration types are checked except type="Namespace".
    """
    print('Checking ' + file + '...')
    data = cppcheckdata.parsedump(file)
    for cfg in data.configurations:
        if len(data.configurations) > 1:
            print('Checking ' + file + ', config "' + cfg.name + '"...')
        # process the variables configuration
        if RE_VARNAME:
            for var in cfg.variables:
                res = re.match(RE_VARNAME, var.nameToken.str)
                if not res:
                    report_error(var.typeStartToken, 'style',
                                 'Variable ' + var.nameToken.str +
                                 ' violates naming convention')
        # process scope configuration type="Function"
        if RE_FUNCTIONNAME:
            classes = create_class_list(cfg.scopes)
            for scope in cfg.scopes:
                if scope.type == 'Function':
                    # exclude constructors and destructors from the function
                    # check; they carry class names, not function names
                    if scope.className in classes:
                        continue
                    res = re.match(RE_FUNCTIONNAME, scope.className)
                    if not res:
                        report_error(scope.classStart, 'style',
                                     'Function ' + scope.className +
                                     ' violates naming convention')
        # process scope configuration type="Class", "Struct", and "Enum"
        if RE_CLASSNAME:
            for scope in cfg.scopes:
                if scope.type in ('Class', 'Struct', 'Enum'):
                    res = re.match(RE_CLASSNAME, scope.className)
                    if not res:
                        report_error(scope.classStart, 'style',
                                     scope.type + ' ' + scope.className +
                                     ' violates naming convention')
# parse command line
args = parser.parse_args()

# now operate on each file in turn
dumpfiles = find_dump_files(args.paths)

for dumpfile in dumpfiles:
    if not args.quiet:
        print('Checking ' + dumpfile + '...')

    # note: str.rstrip('.dump') would strip any trailing '.', 'd', 'u',
    # 'm' or 'p' characters, so slice the extension off instead
    srcfile = dumpfile[:-len('.dump')]

    # at the start of the check, we don't know if the code is Y2038-safe
    y2038safe = False

    # load XML from the .dump file
    data = cppcheckdata.parsedump(dumpfile)

    # go through each configuration
    for cfg in data.configurations:
        if not args.quiet:
            print('Checking ' + dumpfile + ', config "' + cfg.name + '"...')
        safe_ranges = []
        safe = -1
        time_bits_defined = False
        for directive in cfg.directives:
            # track the source line number
            if directive.file == srcfile:
                srclinenr = directive.linenr
            # check for a correct _TIME_BITS if present
            if re_define_time_bits_64.match(directive.str):
                time_bits_defined = True
            elif re_define_time_bits.match(directive.str):
def process(dumpfiles, configfile, debugprint=False):
    errors = []
    conf = loadConfig(configfile)
    for afile in dumpfiles:
        if not afile[-5:] == '.dump':
            continue
        print('Checking ' + afile + '...')
        data = cppcheckdata.parsedump(afile)

        # Check file naming
        if "RE_FILE" in conf and conf["RE_FILE"]:
            mockToken = dataStruct(afile[:-5], "0", afile[afile.rfind('/') + 1:-5])
            msgType = 'File name'
            for exp in conf["RE_FILE"]:
                evalExpr(conf["RE_FILE"], exp, mockToken, msgType, errors)

        # Check namespace naming
        if "RE_NAMESPACE" in conf and conf["RE_NAMESPACE"]:
            for tk in data.rawTokens:
                if tk.str == 'namespace':
                    mockToken = dataStruct(tk.next.file, tk.next.linenr, tk.next.str)
                    msgType = 'Namespace'
                    for exp in conf["RE_NAMESPACE"]:
                        evalExpr(conf["RE_NAMESPACE"], exp, mockToken, msgType, errors)

        for cfg in data.configurations:
            if len(data.configurations) > 1:
                print('Checking ' + afile + ', config "' + cfg.name + '"...')

            # Check (local) variable naming
            if "RE_VARNAME" in conf and conf["RE_VARNAME"]:
                for var in cfg.variables:
                    if var.nameToken and var.access not in ('Global', 'Public', 'Private'):
                        prev = var.nameToken.previous
                        varType = prev.str
                        while "*" in varType and len(varType.replace("*", "")) == 0:
                            prev = prev.previous
                            varType = prev.str + varType
                        if debugprint:
                            print("Variable Name: " + str(var.nameToken.str))
                            print("original Type Name: " +
                                  str(var.nameToken.valueType.originalTypeName))
                            print("Type Name: " + var.nameToken.valueType.type)
                            print("Sign: " + str(var.nameToken.valueType.sign))
                            print("variable type: " + varType)
                            print("\n")
                            print("\t-- {} {}".format(varType, str(var.nameToken.str)))
                        if conf["skip_one_char_variables"] and len(var.nameToken.str) == 1:
                            continue
                        if varType in conf["var_prefixes"]:
                            if not var.nameToken.str.startswith(conf["var_prefixes"][varType]):
                                errors.append(reportError(
                                    var.typeStartToken.file,
                                    var.typeStartToken.linenr,
                                    'style',
                                    'Variable ' + var.nameToken.str +
                                    ' violates naming convention'))
                        mockToken = dataStruct(var.typeStartToken.file,
                                               var.typeStartToken.linenr,
                                               var.nameToken.str)
                        msgType = 'Variable'
                        for exp in conf["RE_VARNAME"]:
                            evalExpr(conf["RE_VARNAME"], exp, mockToken, msgType, errors)

            # Check private member variable naming
            if "RE_PRIVATE_MEMBER_VARIABLE" in conf and conf["RE_PRIVATE_MEMBER_VARIABLE"]:
                # TODO: Not converted yet
                for var in cfg.variables:
                    if (var.access is None) or var.access != 'Private':
                        continue
                    mockToken = dataStruct(var.typeStartToken.file,
                                           var.typeStartToken.linenr,
                                           var.nameToken.str)
                    msgType = 'Private member variable'
                    for exp in conf["RE_PRIVATE_MEMBER_VARIABLE"]:
                        evalExpr(conf["RE_PRIVATE_MEMBER_VARIABLE"], exp,
                                 mockToken, msgType, errors)

            # Check public member variable naming
            if "RE_PUBLIC_MEMBER_VARIABLE" in conf and conf["RE_PUBLIC_MEMBER_VARIABLE"]:
                for var in cfg.variables:
                    if (var.access is None) or var.access != 'Public':
                        continue
                    mockToken = dataStruct(var.typeStartToken.file,
                                           var.typeStartToken.linenr,
                                           var.nameToken.str)
                    msgType = 'Public member variable'
                    for exp in conf["RE_PUBLIC_MEMBER_VARIABLE"]:
                        evalExpr(conf["RE_PUBLIC_MEMBER_VARIABLE"], exp,
                                 mockToken, msgType, errors)

            # Check global variable naming
            if "RE_GLOBAL_VARNAME" in conf and conf["RE_GLOBAL_VARNAME"]:
                for var in cfg.variables:
                    if (var.access is None) or var.access != 'Global':
                        continue
                    mockToken = dataStruct(var.typeStartToken.file,
                                           var.typeStartToken.linenr,
                                           var.nameToken.str)
                    msgType = 'Global variable'
                    for exp in conf["RE_GLOBAL_VARNAME"]:
                        evalExpr(conf["RE_GLOBAL_VARNAME"], exp,
                                 mockToken, msgType, errors)

            # Check function naming
            if "RE_FUNCTIONNAME" in conf and conf["RE_FUNCTIONNAME"]:
                for token in cfg.tokenlist:
                    if token.function:
                        if token.function.type in ('Constructor', 'Destructor'):
                            continue
                        retval = token.previous.str
                        prev = token.previous
                        while "*" in retval and len(retval.replace("*", "")) == 0:
                            prev = prev.previous
                            retval = prev.str + retval
                        if debugprint:
                            print("\t:: {} {}".format(retval, token.function.name))
                        if retval and retval in conf["function_prefixes"]:
                            if not token.function.name.startswith(
                                    conf["function_prefixes"][retval]):
                                errors.append(reportError(
                                    token.file, token.linenr, 'style',
                                    'Function ' + token.function.name +
                                    ' violates naming convention'))
                        mockToken = dataStruct(token.file, token.linenr,
                                               token.function.name)
                        msgType = 'Function'
                        for exp in conf["RE_FUNCTIONNAME"]:
                            evalExpr(conf["RE_FUNCTIONNAME"], exp,
                                     mockToken, msgType, errors)

            # Check class naming (via constructors/destructors, which
            # carry the class name)
            if "RE_CLASS_NAME" in conf and conf["RE_CLASS_NAME"]:
                for fnc in cfg.functions:
                    if fnc.type in ('Constructor', 'Destructor'):
                        mockToken = dataStruct(fnc.tokenDef.file,
                                               fnc.tokenDef.linenr, fnc.name)
                        msgType = 'Class ' + fnc.type
                        for exp in conf["RE_CLASS_NAME"]:
                            evalExpr(conf["RE_CLASS_NAME"], exp,
                                     mockToken, msgType, errors)
    return errors
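The helpers `dataStruct`, `reportError`, and `evalExpr` used throughout `process()` are not shown. A plausible minimal sketch matching the call pattern `evalExpr(conf[key], exp, mockToken, msgType, errors)` might look like the following; these stand-ins are reconstructed from the call sites only, and the real helpers (for example, support for negated patterns) may be richer:

```python
import re
from collections import namedtuple

# Hypothetical stand-in: a lightweight token with the three fields
# every call site passes to dataStruct(...).
dataStruct = namedtuple('dataStruct', ['file', 'linenr', 'str'])


def reportError(filename, linenr, severity, msg):
    # Format one finding as a single diagnostic string.
    return '[%s:%s] (%s) %s' % (filename, linenr, severity, msg)


def evalExpr(conf_list, exp, token, msgType, errors):
    # Treat each configured entry as a regex the name must match;
    # append a style error when it does not.
    if not re.match(exp, token.str):
        errors.append(reportError(token.file, token.linenr, 'style',
                                  msgType + ' ' + token.str +
                                  ' violates naming convention'))
```

With these stand-ins, `evalExpr(['^[a-z_]+$'], '^[a-z_]+$', dataStruct('a.c', '3', 'BadName'), 'Variable', errors)` would append one violation to `errors`.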
def get_cppcheck_config_data_structure(self, dump_file):
    data = cppcheckdata.parsedump(dump_file)
    for c in data.configurations[:1]:
        return c
if __name__ == '__main__':
    args = get_args()
    if args.verify:
        VERIFY = True
    if not args.dumpfile:
        if not args.quiet:
            print("no input files.")
        sys.exit(0)
    for dumpfile in args.dumpfile:
        if not args.quiet:
            print('Checking %s...' % dumpfile)
        data = cppcheckdata.parsedump(dumpfile)
        if VERIFY:
            VERIFY_ACTUAL = []
            VERIFY_EXPECTED = []
            for tok in data.rawTokens:
                if tok.str.startswith('//') and 'TODO' not in tok.str:
                    for word in tok.str[2:].split(' '):
                        if re.match(r'cert-[A-Z][A-Z][A-Z][0-9][0-9].*', word):
                            VERIFY_EXPECTED.append(str(tok.linenr) + ':' + word)
        for cfg in data.configurations:
            if (len(data.configurations) > 1) and (not args.quiet):
                print('Checking %s, config %s...' % (dumpfile, cfg.name))
            exp05(cfg)
def process_dump_file(file):
    """
    Process the dump file data.

    This processes the variables and scopes configurations from
    cppcheckdata.parsedump.
    The variables configuration contains only variables.
    The scopes configuration contains the following types:
        type="Namespace", type="Function", type="Class",
        type="Struct", type="Enum"
    All scope configuration types are checked except type="Namespace".
    This function is adapted from the cppcheck naming.py file.
    """
    variable_names = 0
    function_names = 0
    class_names = 0
    filepath = __astyle_src_dir + file
    data = cppcheckdata.parsedump(filepath)
    for cfg in data.configurations:
        if len(data.configurations) > 1:
            print('Checking ' + file + ', config "' + cfg.name + '"...')
        # process the variables configuration
        if RE_VARNAME:
            for var in cfg.variables:
                # the following 'if' from version 1.88 corrects a bug:
                # python aborted when a function argument had no variable name
                if var.nameToken:
                    variable_names += 1
                    res = re.match(RE_VARNAME, var.nameToken.str)
                    if not res:
                        report_error(
                            var.typeStartToken, 'style',
                            'Variable ' + var.nameToken.str +
                            ' violates naming convention')
        # process scope configuration type="Function"
        if RE_FUNCTIONNAME:
            # get class names to exclude constructors and destructors
            class_list = create_class_list(cfg.scopes)
            for scope in cfg.scopes:
                if scope.type == 'Function':
                    # exclude constructors and destructors from the function
                    # check; they have class names, not function names
                    if scope.className in class_list:
                        continue
                    function_names += 1
                    res = re.match(RE_FUNCTIONNAME, scope.className)
                    if not res:
                        report_error(
                            scope.bodyStart, 'style',
                            'Function ' + scope.className +
                            ' violates naming convention')
        # process scope configuration type="Class", "Struct", and "Enum"
        if RE_CLASSNAME:
            for scope in cfg.scopes:
                if scope.type in ('Class', 'Struct', 'Enum'):
                    class_names += 1
                    res = re.match(RE_CLASSNAME, scope.className)
                    if not res:
                        report_error(
                            scope.bodyStart, 'style',
                            scope.type + ' ' + scope.className +
                            ' violates naming convention')
                else:
                    # verify scope types
                    if scope.type in ('Global', 'Namespace', 'Function'):
                        continue
                    if scope.type in ('If', 'Else', 'While', 'For', 'Do', 'Switch'):
                        continue
                    print('Unrecognized scope.type: ' + scope.type)
    # end of top 'for' loop
    print('variables:', variable_names,
          ' functions:', function_names,
          ' classes:', class_names)
def run_checks(self):
    num_raw_tokens = 0

    # Remove duplicates from the dump file list
    self.args.dumpfile = list(dict.fromkeys(self.args.dumpfile))

    # Run the metric checks for each dump file
    for dumpfile in self.args.dumpfile:
        if not self.args.quiet:
            printf("Checking %s...\n", dumpfile)
        self.statistics_list.append(dumpfile)
        data = cppcheckdata.parsedump(dumpfile)
        if self.args.verify:
            for token in data.rawTokens[num_raw_tokens:]:
                if token.str.startswith('//') and 'TODO' not in token.str:
                    for word in token.str[2:].split(' '):
                        if word.startswith("HIS-"):
                            self.verify_expected.append(
                                token.file + ':' + str(token.linenr) + ':' + word)
        cfg_idx = 0
        for cfg in data.configurations:
            if cfg_idx < 1:
                self.execute_metric_check("COMF", self.his_comf, cfg,
                                          data.rawTokens[num_raw_tokens:])
                self.execute_metric_check("PATH", self.his_path, cfg)
                self.execute_metric_check("GOTO", self.his_goto, cfg)
                self.execute_metric_check("STCYC", self.his_stcyc, cfg)
                self.execute_metric_check("CALLING", self.his_calling, cfg)
                self.execute_metric_check("CALLS", self.his_calls, cfg)
                self.execute_metric_check("PARAM", self.his_param, cfg)
                self.execute_metric_check("STMT", self.his_stmt, cfg)
                self.execute_metric_check("LEVEL", self.his_level, cfg)
                self.execute_metric_check("RETURN", self.his_return, cfg)
                self.execute_metric_check("VOCF", self.his_vocf, cfg)
            cfg_idx += 1
        # Since Cppcheck 2.4, rawTokens has moved from class to instance
        # level and is initialized for each dump file analysis.
        if 'rawTokens' not in data.__dict__:
            num_raw_tokens = len(data.rawTokens)

    if not self.args.quiet:
        printf("Checking metrics for all dump files...\n")

    # Check for violations of HIS-CALLING after all dump files have been analyzed.
    self.execute_metric_check("CALLING", self.his_calling_result)
    # Check for violations of HIS-VOCF after all dump files have been analyzed.
    self.execute_metric_check("VOCF", self.his_vocf_result)
    # Check for violations of HIS-NRECUR after all dump files have been analyzed.
    self.execute_metric_check("NRECUR", self.his_num_recursions)

    if self.args.verify:
        for expected in self.verify_expected:
            if expected not in self.verify_actual:
                printf("Expected but not seen: %s\n", expected)
        for actual in self.verify_actual:
            if actual not in self.verify_expected:
                printf("Not expected: %s\n", actual)

    # Print a summary unless suppressed on the command line
    if not self.args.no_summary and not self.args.verify:
        printf("\n---------------------------\n")
        printf("--- Summary of violations\n")
        printf("---------------------------\n")
        for key in self.his_stats:
            if self.his_stats[key] == "Suppressed":
                printf("HIS-%s: %s\n", key.ljust(10), self.his_stats[key])
            else:
                printf("HIS-%s: %d\n", key.ljust(10), self.his_stats[key])
        printf("\n")
    if self.args.statistics and not self.args.verify:
        printf("\n---------------------------\n")
        printf("--- Statistics information\n")
        printf("---------------------------\n")
        for item in self.statistics_list:
            printf("%s\n", item)
        printf("\n")
def process(dumpfiles, configfile, debugprint=False):
    errors = []
    conf = loadConfig(configfile)
    for afile in dumpfiles:
        if not afile[-5:] == '.dump':
            continue
        print('Checking ' + afile + '...')
        data = cppcheckdata.parsedump(afile)
        for cfg in data.configurations:
            if len(data.configurations) > 1:
                print('Checking ' + afile + ', config "' + cfg.name + '"...')
            if conf["RE_VARNAME"]:
                for var in cfg.variables:
                    if var.nameToken:
                        prev = var.nameToken.previous
                        varType = prev.str
                        while "*" in varType and len(varType.replace("*", "")) == 0:
                            prev = prev.previous
                            varType = prev.str + varType
                        if debugprint:
                            print("Variable Name: " + str(var.nameToken.str))
                            print("original Type Name: " +
                                  str(var.nameToken.valueType.originalTypeName))
                            print("Type Name: " + var.nameToken.valueType.type)
                            print("Sign: " + str(var.nameToken.valueType.sign))
                            print("variable type: " + varType)
                            print("\n")
                            print("\t-- {} {}".format(varType, str(var.nameToken.str)))
                        if conf["skip_one_char_variables"] and len(var.nameToken.str) == 1:
                            continue
                        if varType in conf["var_prefixes"]:
                            if not var.nameToken.str.startswith(
                                    conf["var_prefixes"][varType]):
                                errors.append(reportError(
                                    var.typeStartToken.file,
                                    var.typeStartToken.linenr,
                                    'style',
                                    'Variable ' + var.nameToken.str +
                                    ' violates naming convention'))
                        res = re.match(conf["RE_VARNAME"], var.nameToken.str)
                        if not res:
                            errors.append(reportError(
                                var.typeStartToken.file,
                                var.typeStartToken.linenr,
                                'style',
                                'Variable ' + var.nameToken.str +
                                ' violates naming convention'))
            if conf["RE_PRIVATE_MEMBER_VARIABLE"]:
                # TODO: Not converted yet
                for var in cfg.variables:
                    if (var.access is None) or var.access != 'Private':
                        continue
                    res = re.match(conf["RE_PRIVATE_MEMBER_VARIABLE"],
                                   var.nameToken.str)
                    if not res:
                        errors.append(reportError(
                            var.typeStartToken.file,
                            var.typeStartToken.linenr,
                            'style',
                            'Private member variable ' + var.nameToken.str +
                            ' violates naming convention'))
            if conf["RE_FUNCTIONNAME"]:
                for token in cfg.tokenlist:
                    if token.function:
                        retval = token.previous.str
                        prev = token.previous
                        while "*" in retval and len(retval.replace("*", "")) == 0:
                            prev = prev.previous
                            retval = prev.str + retval
                        if debugprint:
                            print("\t:: {} {}".format(retval, token.function.name))
                        if retval and retval in conf["function_prefixes"]:
                            if not token.function.name.startswith(
                                    conf["function_prefixes"][retval]):
                                errors.append(reportError(
                                    token.file, token.linenr, 'style',
                                    'Function ' + token.function.name +
                                    ' violates naming convention'))
                        res = re.match(conf["RE_FUNCTIONNAME"], token.function.name)
                        if not res:
                            errors.append(reportError(
                                token.file, token.linenr, 'style',
                                'Function ' + token.function.name +
                                ' violates naming convention'))
    return errors
def check_errors_with_low_confidence_when_top3_units(
        self, cppcheck_configuration_unit, sorted_analysis_unit_dict):
    # need to work on another copy of cppcheckdata;
    # check after all errors are collected
    con.print_known_unit_variables()
    data = cppcheckdata.parsedump(self.dump_file)
    c = None
    for c in data.configurations[:1]:  # grab the first configuration only
        break

    # copy token and function data from the original config
    tokenlist = {}
    for t in cppcheck_configuration_unit.tokenlist:
        tokenlist[t.Id] = (t.isRoot, t.isDimensionless)
    functionlist = {}
    returnexprtokenlist = {}
    for f in cppcheck_configuration_unit.functions:
        functionlist[f.Id] = (
            f.return_units,
            f.arg_units,
            f.return_arg_var_nr,
            f.return_expr_root_token,
            f.is_unit_propagation_based_on_constants,
            f.is_unit_propagation_based_on_unknown_variable,
            f.is_unit_propagation_based_on_weak_inference,
            f.maybe_generic_function)
        if f.return_expr_root_token:
            returnexprtokenlist[f.return_expr_root_token.Id] = None

    returntokenlist = {}
    for t in c.tokenlist:
        (isRoot, isDimensionless) = tokenlist[t.Id]
        t.units = []
        t.isKnown = False
        t.is_unit_propagation_based_on_constants = False
        t.is_unit_propagation_based_on_unknown_variable = False
        t.is_unit_propagation_based_on_weak_inference = False
        t.isRoot = isRoot
        t.isDimensionless = isDimensionless
        if t.str == "return":
            returntokenlist[t.Id] = t
        if t.Id in returnexprtokenlist:
            returnexprtokenlist[t.Id] = t

    for f in c.functions:
        (return_units, arg_units, return_arg_var_nr, return_expr_root_token,
         is_unit_propagation_based_on_constants,
         is_unit_propagation_based_on_unknown_variable,
         is_unit_propagation_based_on_weak_inference,
         maybe_generic_function) = functionlist[f.Id]
        if return_expr_root_token:
            return_expr_root_token = returnexprtokenlist[return_expr_root_token.Id]
        f.return_units = []  # return_units
        f.arg_units = []  # arg_units
        f.return_arg_var_nr = return_arg_var_nr
        f.return_expr_root_token = return_expr_root_token
        f.is_unit_propagation_based_on_constants = \
            is_unit_propagation_based_on_constants
        f.is_unit_propagation_based_on_unknown_variable = \
            is_unit_propagation_based_on_unknown_variable
        f.is_unit_propagation_based_on_weak_inference = \
            is_unit_propagation_based_on_weak_inference
        f.maybe_generic_function = maybe_generic_function
        for arg_number in f.argument.keys():
            f.arg_units.append([])

    # collect the return units of all functions
    returnlist = {}
    for function_dict in sorted_analysis_unit_dict.values():
        if not function_dict['scopeObject'].function:
            continue
        if function_dict['scopeObject'].function.return_arg_var_nr:
            continue
        if function_dict['scopeObject'].function.maybe_generic_function:
            continue
        returnlist[function_dict['scopeObject'].function.Id] = []
        for root_token in function_dict['root_tokens']:
            if root_token.str == 'return':
                t = returntokenlist[root_token.Id]
                self.check_error_when_top3_units(t)
                # RETURN STATEMENT WITH UNITS - STORE UNITS
                if t.units:
                    for u in t.units:
                        if u not in returnlist[
                                function_dict['scopeObject'].function.Id]:
                            returnlist[
                                function_dict['scopeObject'].function.Id].append(u)
                tw = TreeWalker(None)
                tw.generic_recurse_and_apply_function(t, tw.reset_tokens)
    for f in c.functions:
        return_units = returnlist.get(f.Id)
        if return_units:
            f.return_units = return_units

    # check all errors
    for e in self.all_errors:
        con.FOUND_DERIVED_CU_VARIABLE = False
        if e.is_warning:
            continue
        if e.dont_check_for_warning:
            continue
        if e.ERROR_TYPE == UnitErrorTypes.ADDITION_OF_INCOMPATIBLE_UNITS or \
                e.ERROR_TYPE == UnitErrorTypes.COMPARISON_INCOMPATIBLE_UNITS:
            root_token = None
            # find the token in the copy
            for t in c.tokenlist:
                if t.Id == e.token.Id:
                    root_token = t
                    break
            if not root_token:
                continue
            self.check_error_when_top3_units(root_token)
            if e.ERROR_TYPE == UnitErrorTypes.ADDITION_OF_INCOMPATIBLE_UNITS:
                if con.FOUND_DERIVED_CU_VARIABLE:
                    if len(root_token.units) > 2:
                        e.is_warning = True
                elif root_token.units:
                    e.is_warning = True
            else:
                units = []
                if root_token.astOperand1 and root_token.astOperand2:
                    left_units = root_token.astOperand1.units
                    right_units = root_token.astOperand2.units
                    if not left_units:
                        pass  # units = right_units
                    elif not right_units:
                        pass  # units = left_units
                    else:
                        for lu in left_units:
                            if lu in right_units:
                                units.append(lu)
                if con.FOUND_DERIVED_CU_VARIABLE:
                    if len(units) > 2:
                        e.is_warning = True
                elif units:
                    e.is_warning = True
            tw = TreeWalker(None)
            tw.generic_recurse_and_apply_function(root_token, tw.reset_tokens)
        elif e.ERROR_TYPE == UnitErrorTypes.VARIABLE_MULTIPLE_UNITS:
            i = 0
            root_token = None
            left_token = None
            # find both tokens in the copy
            for t in c.tokenlist:
                if t.Id == e.token.Id:
                    root_token = t
                    i += 1
                elif t.Id == e.token_left.Id:
                    left_token = t
                    i += 1
                if i == 2:
                    break
            if (not root_token) or (not left_token):
                continue
            elif not root_token.astOperand2:
                continue
            elif not root_token.astOperand1:
                continue
            self.check_error_when_top3_units(root_token.astOperand1)
            self.check_error_when_top3_units(root_token.astOperand2)
            if not left_token.isKnown:  # and root_token.astOperand2.units:
                if con.FOUND_DERIVED_CU_VARIABLE:
                    if len(root_token.astOperand2.units) > 2:
                        e.is_warning = True
                elif root_token.astOperand2.units:
                    e.is_warning = True
            else:
                if root_token.astOperand2.units and (
                        root_token.astOperand1.units ==
                        root_token.astOperand2.units):
                    e.is_warning = True
                if (not e.is_warning) and (
                        root_token.astOperand1
                        .is_unit_propagation_based_on_weak_inference):
                    units = []
                    for lu in root_token.astOperand1.units:
                        if lu in root_token.astOperand2.units:
                            units.append(lu)
                    if con.FOUND_DERIVED_CU_VARIABLE:
                        if len(units) > 2:
                            e.is_warning = True
                    elif units:
                        e.is_warning = True
            tw = TreeWalker(None)
            tw.generic_recurse_and_apply_function(root_token, tw.reset_tokens)
# (excerpt: the head of expr(), which initializes 'inputs' and the
#  'tokens' worklist and opens the loop, precedes this)
        tokens = tokens[1:]
        if not token:
            continue
        if token.str == '(':
            if token.previous.str[0].isalpha():
                inputs.append(token.previous.str)
        elif token.str[0].isalpha():
            inputs.append(token.str)
        elif token.str[0].isdigit():
            inputs.append(token.str)
        else:
            tokens.append(token.astOperand1)
            tokens.append(token.astOperand2)
    return str(inputs)


data = cppcheckdata.parsedump('test/1.c.dump')
for scope in data.scopes:
    if scope.type == 'Function':
        print(scope.className)
        tok = scope.classStart
        while tok and tok != scope.classEnd:
            if tok.astOperand1:
                astTop = tok
                while astTop.astParent:
                    astTop = astTop.astParent
                if astTop.str == '=' and astTop.astOperand1.variable:
                    print(astTop.Id + ' ' + astTop.astOperand1.str + ":=" +
                          expr(astTop.astOperand2))
                if astTop.str == 'return':
def main_run_check(self, dump_file, source_file=''):
    '''
    PERFORM UNITS CHECKING  (todo: pep8)
    input: a cppcheck 'dump' file containing an Abstract Syntax Tree (AST),
        symbol table, and token list.
    returns: None
    side-effects: updates database with information about this unit analysis
    '''
    if self.debug_verbose:
        print(inspect.stack()[0][3])
    self.source_file = source_file
    self.current_file_under_analysis = dump_file
    # PARSE INPUT
    data = cppcheckdata.parsedump(dump_file)
    analysis_unit_dict = {}
    # INITIALIZE ERROR CHECKING OBJECT
    self.error_checker = ErrorChecker(self.debug, source_file, self)
    # GIVE TREE WALKER ACCESS TO SOURCE FILE FOR DEBUG PRINT
    if self.source_file and self.debug and os.path.exists(self.source_file):
        with open(self.source_file) as f:
            self.source_file_lines = f.readlines()
        print("yes")
    else:
        if self.debug:
            print("no %s %s" % (self.source_file, self.debug))
    # todo: what is a data configuration? -- check for multiple
    for c in data.configurations[:1]:  # MODIFIED TO ONLY TEST THE FIRST CONFIGURATION
        self.current_configuration = c
        # ADD AST DECORATION PLACEHOLDERS
        c = self.init_cppcheck_config_data_structures(c)
        # REFRESH VARIABLES
        self.function_graph = nx.DiGraph()
        # GET DICT OF ALL GLOBALLY SCOPED FUNCTIONS
        analysis_unit_dict = self.find_functions(c)
        # WILL BECOME AN ORDERED DICT IF self.should_sort_by_function_graph
        sorted_analysis_unit_dict = analysis_unit_dict
        # FIND ORDER FOR FUNCTION GRAPH EXPLORATION (topo sort, if possible)
        if self.should_sort_by_function_graph:
            self.build_function_graph(analysis_unit_dict)  # WILL USE DAG SUBGRAPH
            sorted_analysis_unit_dict = \
                self.make_sorted_analysis_unit_dict_from_function_graph(
                    analysis_unit_dict)  # RETURNS ORDERED DICT
            self.all_sorted_analysis_unit_dicts.append(sorted_analysis_unit_dict)
        # SPECIAL MODE FOR COUNTING / IDENTIFYING FILES WITH ROS UNITS
        if self.SHOULD_ONLY_FIND_FILES_WITH_UNITS and \
                self.found_ros_units_in_this_file:
            return
        # print("G nodes:%d edges:%d" % (G.order(), G.size()))
        if self.debug_print_function_topo_sort:
            self.debug_print_function_graph(sorted_analysis_unit_dict)
        # COLLECT ALL TOKEN PARSE TREES FOR EACH FUNCTION
        for function_dict in sorted_analysis_unit_dict.values():
            self.analyze_function(function_dict)
            # SAVE OFF A POINTER TO THE TREE WALKERS USED IN THIS ANALYSIS
            # FOR TESTING
            self.all_tree_walkers.append(self.tw)
            self.debug_function_count += 1
        if self.debug:
            self.error_checker.pretty_print()
        # DEBUG COUNTERS
        self.debug_configuration_count += 1
        self.debug_function_count = 0

    ## ------------------------------------------------------------------
    ## OPTIONAL: FIND UNITS ONLY AND WRITE TO DATABASE
    ## ------------------------------------------------------------------
    if self.SHOULD_ONLY_FIND_FILES_WITH_UNITS and self.SHOULD_FIND_ALL_UNITS:
        self.insert_file_unit_class_records()
        return

    ## ------------------------------------------------------------------
    ## MAIN ERROR CHECKING
    ## ------------------------------------------------------------------
    for c in data.configurations[:1]:  # FIRST ONLY
        # CHECK THIS CONFIGURATION FOR ERRORS
        self.error_checker.error_check_multiple_units(c)
        self.error_checker.error_check_function_args_consistent(c)
        # self.error_checker.error_check_unit_smell(c)

    # todo: when the same error occurs in multiple data configs,
    #       where is a good place to catch that?
    # todo: add checking of addition on RH
    for sorted_analysis_unit_dict in self.all_sorted_analysis_unit_dicts:
        self.error_checker.error_check_comparisons(sorted_analysis_unit_dict)
        self.error_checker.error_check_logical_operators(sorted_analysis_unit_dict)
        self.error_checker.error_check_addition_of_incompatible_units(
            sorted_analysis_unit_dict)

    # GET ERRORS FROM ERROR CHECKER OBJECT (add to list of pointers)
    for e in self.error_checker.all_errors:
        self.errors.append(e)
        # COUNTS ERRORS BY TYPE
        # self.list_of_error_counts_by_type[e.ERROR_TYPE] += 1

    if self.SHOULD_WRITE_RESULTS_TO_DATABASE:
        if self.errors:
            self.update_database(self.errors)
# (excerpt: tail of exp42(), which flags memcmp() comparisons of
#  unpacked local structs)
    arg1 = None
    arg2 = None
    if token.astOperand2 and token.astOperand2.str == ',':
        if token.astOperand2.astOperand1 and \
                token.astOperand2.astOperand1.str == ',':
            arg1 = token.astOperand2.astOperand1.astOperand1
            arg2 = token.astOperand2.astOperand1.astOperand2
    if token.astOperand1.str == 'memcmp' and \
            (isLocalUnpackedStruct(arg1) or isLocalUnpackedStruct(arg2)):
        reportError(
            token, 'style',
            "Comparison of struct padding data "
            "(fix either by packing the struct using '#pragma pack' "
            "or by rewriting the comparison)",
            'cert-EXP42-C')


# EXP46-C
# Do not use a bitwise operator with a Boolean-like operand
#   int x = (a == b) & c;
def exp46(data):
    for token in data.tokenlist:
        if isBitwiseOp(token) and (isComparisonOp(token.astOperand1) or
                                   isComparisonOp(token.astOperand2)):
            reportError(token, 'style',
                        'Bitwise operator is used with a Boolean-like operand',
                        'cert-EXP46-c')


for arg in sys.argv[1:]:
    print('Checking ' + arg + '...')
    data = cppcheckdata.parsedump(arg)
    for cfg in data.configurations:
        if len(data.configurations) > 1:
            print('Checking ' + arg + ', config "' + cfg.name + '"...')
        exp42(cfg)
        exp46(cfg)
def check_y2038_safe(dumpfile, quiet=False):
    # at the start of the check, assume the code is Y2038 safe;
    # any finding below flips this to False
    y2038safe = True

    # load XML from .dump file
    data = cppcheckdata.parsedump(dumpfile)

    # Convert the dump file path back to the source file path, in the form
    # cppcheck generates it.  For example, after:
    #     cppcheck ./src/my-src.c --dump
    # the 'file' field in cppcheckdata holds 'src/my-src.c'.
    # Note: str.rstrip('.dump') would strip a trailing *character set*, not
    # the suffix, so slice the extension off instead.
    srcfile = dumpfile[:-len('.dump')] if dumpfile.endswith('.dump') else dumpfile
    srcfile = os.path.expanduser(srcfile)
    srcfile = os.path.normpath(srcfile)

    # go through each configuration
    for cfg in data.configurations:
        if not quiet:
            print('Checking ' + srcfile + ', config "' + cfg.name + '"...')
        safe_ranges = []
        safe = -1
        time_bits_defined = False
        srclinenr = '0'

        for directive in cfg.directives:
            # track source line number
            if directive.file == srcfile:
                srclinenr = directive.linenr
            # check for correct _TIME_BITS if present
            if re_define_time_bits_64.match(directive.str):
                time_bits_defined = True
            elif re_define_time_bits.match(directive.str):
                cppcheckdata.reportError(directive, 'error',
                                         '_TIME_BITS must be defined equal to 64',
                                         'y2038', 'type-bits-not-64')
                time_bits_defined = False
                y2038safe = False
            elif re_undef_time_bits.match(directive.str):
                time_bits_defined = False
            # check for _USE_TIME_BITS64 (un)definition
            if re_define_use_time_bits64.match(directive.str):
                safe = int(srclinenr)
                # warn about _TIME_BITS not being defined
                if not time_bits_defined:
                    cppcheckdata.reportError(directive, 'warning',
                                             '_USE_TIME_BITS64 is defined but _TIME_BITS was not',
                                             'y2038', 'type-bits-undef')
            elif re_undef_use_time_bits64.match(directive.str):
                unsafe = int(srclinenr)
                # do we have a safe..unsafe area?
                if unsafe > safe > 0:
                    safe_ranges.append((safe, unsafe))
                    safe = -1

        # check end of source beyond last directive
        if len(cfg.tokenlist) > 0:
            unsafe = int(cfg.tokenlist[-1].linenr)
            if unsafe > safe > 0:
                safe_ranges.append((safe, unsafe))

        # go through all tokens
        for token in cfg.tokenlist:
            if token.str in id_Y2038:
                if not any(lower <= int(token.linenr) <= upper
                           for (lower, upper) in safe_ranges):
                    cppcheckdata.reportError(token, 'warning',
                                             token.str + ' is Y2038-unsafe',
                                             'y2038', 'unsafe-call')
                    y2038safe = False
            token = token.next

    return y2038safe
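A note on trimming the `.dump` extension above: `str.rstrip('.dump')` strips a trailing *set of characters* (`.`, `d`, `u`, `m`, `p`), not the literal suffix, so it can eat into the real file name. A standalone illustration (the file name is made up):

```python
# str.rstrip removes any run of trailing characters from the given set,
# not the suffix string, so 'mydump.dump' loses more than its extension.
name = 'mydump.dump'
via_rstrip = name.rstrip('.dump')  # strips trailing '.', 'd', 'u', 'm', 'p' -> 'my'
via_slice = name[:-len('.dump')] if name.endswith('.dump') else name  # -> 'mydump'
```

Slicing on a verified suffix (or `removesuffix` on Python 3.9+) keeps the stem intact.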
            print('Fatal error: file is not found: ' + filename)
            sys.exit(1)
        loadRuleTexts(filename)
    elif ".dump" in arg:
        continue
    elif arg == "-generate-table":
        generateTable()
    else:
        print('Fatal error: unhandled argument ' + arg)
        sys.exit(1)

for arg in sys.argv[1:]:
    if not arg.endswith('.dump'):
        continue

    data = cppcheckdata.parsedump(arg)

    CHAR_BIT = data.platform.char_bit
    SHORT_BIT = data.platform.short_bit
    INT_BIT = data.platform.int_bit
    LONG_BIT = data.platform.long_bit
    LONG_LONG_BIT = data.platform.long_long_bit
    POINTER_BIT = data.platform.pointer_bit

    if VERIFY:
        VERIFY_ACTUAL = []
        VERIFY_EXPECTED = []
        for tok in data.rawTokens:
            if tok.str.startswith('//') and 'TODO' not in tok.str:
                compiled = re.compile(r'[0-9]+\.[0-9]+')
                for word in tok.str[2:].split(' '):
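The VERIFY loop above collects expected rule numbers (e.g. `10.4`) from `//` comments using the regex `[0-9]+\.[0-9]+`. A standalone sketch of that extraction step on a made-up comment string:

```python
# Mirror of the comment-scanning step: split a '//' comment into words
# and keep those that look like MISRA rule numbers (N.M).
import re

compiled = re.compile(r'[0-9]+\.[0-9]+')
comment = '// 10.4 14.2 misc'  # hypothetical raw token text
expected = [word for word in comment[2:].split(' ') if compiled.match(word)]
```

Note that `re.match` anchors only at the start of each word, which is all the addon needs here since each candidate word is examined individually.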
def process(dumpfiles, configfile, debugprint=False):
    errors = []

    conf = loadConfig(configfile)

    for afile in dumpfiles:
        if not afile[-5:] == '.dump':
            continue
        print('Checking ' + afile + '...')
        data = cppcheckdata.parsedump(afile)

        for cfg in data.configurations:
            if len(data.configurations) > 1:
                print('Checking ' + afile + ', config "' + cfg.name + '"...')
            if conf["RE_VARNAME"]:
                for var in cfg.variables:
                    if var.nameToken:
                        prev = var.nameToken.previous
                        varType = prev.str
                        while "*" in varType and len(varType.replace("*", "")) == 0:
                            prev = prev.previous
                            varType = prev.str + varType
                        if debugprint:
                            print("Variable Name: " + str(var.nameToken.str))
                            print("original Type Name: " + str(var.nameToken.valueType.originalTypeName))
                            print("Type Name: " + var.nameToken.valueType.type)
                            print("Sign: " + str(var.nameToken.valueType.sign))
                            print("variable type: " + varType)
                            print("\n")
                            print("\t-- {} {}".format(varType, str(var.nameToken.str)))
                        if conf["skip_one_char_variables"] and len(var.nameToken.str) == 1:
                            continue
                        if varType in conf["var_prefixes"]:
                            if not var.nameToken.str.startswith(conf["var_prefixes"][varType]):
                                errors.append(reportError(
                                    var.typeStartToken.file,
                                    var.typeStartToken.linenr,
                                    'style',
                                    'Variable ' + var.nameToken.str + ' violates naming convention'))

                        res = re.match(conf["RE_VARNAME"], var.nameToken.str)
                        if not res:
                            errors.append(reportError(
                                var.typeStartToken.file,
                                var.typeStartToken.linenr,
                                'style',
                                'Variable ' + var.nameToken.str + ' violates naming convention'))
            if conf["RE_PRIVATE_MEMBER_VARIABLE"]:
                # TODO: Not converted yet
                for var in cfg.variables:
                    if (var.access is None) or var.access != 'Private':
                        continue
                    res = re.match(conf["RE_PRIVATE_MEMBER_VARIABLE"], var.nameToken.str)
                    if not res:
                        errors.append(reportError(
                            var.typeStartToken.file,
                            var.typeStartToken.linenr,
                            'style',
                            'Private member variable ' + var.nameToken.str + ' violates naming convention'))
            if conf["RE_FUNCTIONNAME"]:
                for token in cfg.tokenlist:
                    if token.function:
                        retval = token.previous.str
                        prev = token.previous
                        while "*" in retval and len(retval.replace("*", "")) == 0:
                            prev = prev.previous
                            retval = prev.str + retval
                        if debugprint:
                            print("\t:: {} {}".format(retval, token.function.name))
                        if retval and retval in conf["function_prefixes"]:
                            if not token.function.name.startswith(conf["function_prefixes"][retval]):
                                errors.append(reportError(
                                    token.file, token.linenr,
                                    'style',
                                    'Function ' + token.function.name + ' violates naming convention'))
                        res = re.match(conf["RE_FUNCTIONNAME"], token.function.name)
                        if not res:
                            errors.append(reportError(
                                token.file, token.linenr,
                                'style',
                                'Function ' + token.function.name + ' violates naming convention'))
    return errors
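The naming checks in `process()` all reduce to matching an identifier against a configured regex. A standalone sketch of that core test; the `RE_VARNAME` pattern below is a hypothetical lower_snake_case convention, not the addon's actual configuration:

```python
# Core of the naming check: an identifier violates the convention when
# the configured pattern does not match it.
import re

RE_VARNAME = r'^[a-z][a-z0-9_]*$'  # hypothetical convention: lower_snake_case

def violates_naming(name):
    # re.match anchors at the start; '$' in the pattern anchors the end too.
    return re.match(RE_VARNAME, name) is None
```

The addon's own patterns are unanchored at the end (plain `re.match`), so a stricter convention would add `$` as done here.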