def get_state(
    action: Action, job: JobLog, status: str
) -> Tuple[Optional[str], List[str], List[str]]:
    sub_action = None
    if job and job.sub_action:
        sub_action = job.sub_action

    if status == config.Job.SUCCESS:
        multi_state_set = action.multi_state_on_success_set
        multi_state_unset = action.multi_state_on_success_unset
        state = action.state_on_success
        if not state:
            log.warning('action "%s" success state is not set', action.name)
    elif status == config.Job.FAILED:
        state = getattr_first('state_on_fail', sub_action, action)
        multi_state_set = getattr_first('multi_state_on_fail_set', sub_action, action)
        multi_state_unset = getattr_first('multi_state_on_fail_unset', sub_action, action)
        if not state:
            log.warning('action "%s" fail state is not set', action.name)
    else:
        log.error('unknown task status: %s', status)
        state = None
        multi_state_set = []
        multi_state_unset = []
    return state, multi_state_set, multi_state_unset

def set_obj_state(obj_type, obj_id, state):
    if obj_type == 'adcm':
        return None
    if obj_type not in ('cluster', 'service', 'host', 'provider'):
        log.error('Unknown object type: "%s"', obj_type)
        return None
    return post_event('change_state', obj_type, obj_id, 'state', state)

def load_social_auth():
    try:
        adcm = ADCM.objects.filter()
        if not adcm:
            return
    except OperationalError:
        return
    except AdcmEx as error:
        # This code handles the "JSON_DB_ERROR" error that occurs when
        # the "0057_auto_20200831_1055" migration is applied. In the "ADCM" object,
        # the "stack" field type was changed from "TextField" to "JSONField", so the "stack" field
        # contained an empty string, which is not a valid json format.
        # This error occurs due to the fact that when "manage.py migrate" is started, the "urls.py"
        # module is imported, in which the "load_social_auth()" function is called.
        if error.code == 'JSON_DB_ERROR':
            executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
            if ('cm', '0057_auto_20200831_1055') not in executor.loader.applied_migrations:
                return
        raise error

    try:
        cl = ConfigLog.objects.get(obj_ref=adcm[0].config, id=adcm[0].config.current)
        prepare_social_auth(cl.config)
    except OperationalError as e:
        log.error('load_social_auth error: %s', e)

def backup_db():
    if not check_migrations():
        return
    db = DATABASES['default']
    if db['ENGINE'] != 'django.db.backends.sqlite3':
        log.error('Backup for %s not implemented yet', db['ENGINE'])
        return
    backup_sqlite(db['NAME'])

def complete(request, *args, **kwargs):
    try:
        return social_django.views.complete(request, 'google-oauth2', *args, **kwargs)
    except AuthForbidden as e:
        log.error("social AUTH_ERROR: %s", e)
        params = urlencode({'error_code': 'AUTH_ERROR', 'error_msg': e})
        return redirect(f"/login/?{params}")

def run_job(task_id, job_id, err_file):
    log.debug("run job #%s of task #%s", job_id, task_id)
    try:
        proc = subprocess.Popen(
            [os.path.join(config.CODE_DIR, 'job_runner.py'), str(job_id)],
            stderr=err_file,
        )
        res = proc.wait()
        return res
    except Exception:  # do not swallow SystemExit / KeyboardInterrupt
        log.error("exception running job %s", job_id)
        return 1

def run_task(task_id, args=None):
    log.debug("task_runner.py called as: %s", sys.argv)
    try:
        task = TaskLog.objects.get(id=task_id)
    except ObjectDoesNotExist:
        log.error("no task %s", task_id)
        return

    jobs = JobLog.objects.filter(task_id=task.id).order_by('id')
    if not jobs:
        log.error("no jobs for task %s", task.id)
        cm.job.finish_task(task, None, config.Job.FAILED)
        return

    err_file = open(os.path.join(config.LOG_DIR, 'job_runner.err'), 'a+', encoding='utf_8')

    log.info("run task #%s", task_id)

    job = None
    count = 0
    res = 0
    for job in jobs:
        if args == 'restart' and job.status == config.Job.SUCCESS:
            log.info('skip job #%s status "%s" of task #%s', job.id, job.status, task_id)
            continue
        cm.job.re_prepare_job(task, job)
        job.start_date = timezone.now()
        job.save()
        res = run_job(task.id, job.id, err_file)
        set_body_ansible(job)
        # For multi jobs task object state and/or config can be changed by adcm plugins
        if task.task_object is not None:
            try:
                task.task_object.refresh_from_db()
            except ObjectDoesNotExist:
                task.object_id = 0
                task.object_type = None
        count += 1
        if res != 0:
            break

    if res == 0:
        cm.job.finish_task(task, job, config.Job.SUCCESS)
    else:
        cm.job.finish_task(task, job, config.Job.FAILED)

    err_file.close()

    log.info("finish task #%s, ret %s", task_id, res)

def load_social_auth():
    try:
        adcm = ADCM.objects.filter()
        if not adcm:
            return
    except OperationalError:
        return

    try:
        cl = ConfigLog.objects.get(obj_ref=adcm[0].config, id=adcm[0].config.current)
        prepare_social_auth(cl.config)
    except OperationalError as e:
        log.error('load_social_auth error: %s', e)

def run_job(task_id, job_id, err_file):
    log.debug("task run job #%s of task #%s", job_id, task_id)
    cmd = [
        '/adcm/python/job_venv_wrapper.sh',
        TaskLog.objects.get(id=task_id).action.venv,
        os.path.join(config.CODE_DIR, 'job_runner.py'),
        str(job_id),
    ]
    log.info("task run job cmd: %s", ' '.join(cmd))
    try:
        proc = subprocess.Popen(cmd, stderr=err_file)
        res = proc.wait()
        return res
    except Exception:  # do not swallow SystemExit / KeyboardInterrupt
        log.error("exception running job %s", job_id)
        return 1

def restore_hc(task, action, status):
    if status != config.Job.FAILED:
        return
    if not action.hostcomponentmap:
        return

    selector = task.selector
    if 'cluster' not in selector:
        log.error('no cluster in task #%s selector', task.id)
        return
    cluster = Cluster.objects.get(id=selector['cluster'])

    host_comp_list = []
    for hc in task.hostcomponentmap:
        host = Host.objects.get(id=hc['host_id'])
        service = ClusterObject.objects.get(id=hc['service_id'], cluster=cluster)
        comp = ServiceComponent.objects.get(
            id=hc['component_id'], cluster=cluster, service=service
        )
        host_comp_list.append((service, host, comp))

    log.warning('task #%s is failed, restore old hc', task.id)
    api.save_hc(cluster, host_comp_list)

def get_task_obj(context, obj_id):
    def get_obj_safe(model, obj_id):
        try:
            return model.objects.get(id=obj_id)
        except model.DoesNotExist:
            return None

    if context == 'service':
        obj = get_obj_safe(ClusterObject, obj_id)
    elif context == 'host':
        obj = get_obj_safe(Host, obj_id)
    elif context == 'cluster':
        obj = Cluster.objects.get(id=obj_id)
    elif context == 'provider':
        obj = HostProvider.objects.get(id=obj_id)
    elif context == 'adcm':
        obj = ADCM.objects.get(id=obj_id)
    else:
        log.error("unknown context: %s", context)
        return None
    return obj

def restore_hc(task: TaskLog, action: Action, status: str):
    if status != config.Job.FAILED:
        return
    if not action.hostcomponentmap:
        return

    cluster = get_object_cluster(task.task_object)
    if cluster is None:
        log.error('no cluster in task #%s', task.pk)
        return

    host_comp_list = []
    for hc in task.hostcomponentmap:
        host = Host.objects.get(id=hc['host_id'])
        service = ClusterObject.objects.get(id=hc['service_id'], cluster=cluster)
        comp = ServiceComponent.objects.get(
            id=hc['component_id'], cluster=cluster, service=service
        )
        host_comp_list.append((service, host, comp))

    log.warning('task #%s is failed, restore old hc', task.pk)
    api.save_hc(cluster, host_comp_list)

def get_state(action, job, status):
    sub_action = None
    if job and job.sub_action_id:
        sub_action = SubAction.objects.get(id=job.sub_action_id)

    if status == config.Job.SUCCESS:
        if not action.state_on_success:
            log.warning('action "%s" success state is not set', action.name)
            state = None
        else:
            state = action.state_on_success
    elif status == config.Job.FAILED:
        if sub_action and sub_action.state_on_fail:
            state = sub_action.state_on_fail
        elif action.state_on_fail:
            state = action.state_on_fail
        else:
            log.warning('action "%s" fail state is not set', action.name)
            state = None
    else:
        log.error('unknown task status: %s', status)
        state = None
    return state

def api_get(path):
    url = API_URL + path
    try:
        r = requests.get(
            url,
            headers={
                'Content-Type': 'application/json',
                'Authorization': 'Token ' + STATUS_SECRET_KEY,
            },
            timeout=TIMEOUT,
        )
        if r.status_code not in (200, 201):
            log.error("GET %s error %d: %s", url, r.status_code, r.text)
        return r
    except requests.exceptions.Timeout:
        log.error("GET request to %s timed out", url)
        return None
    except requests.exceptions.ConnectionError:
        log.error("GET request to %s connection failed", url)
        return None

def raise_AdcmEx(code, msg='', args=''):
    (_, err_msg, _, _) = get_error(code)
    if msg != '':
        err_msg = msg
    log.error(err_msg)
    raise AdcmEx(code, msg=msg, args=args)