def getallparandbnds(self,PropAxesColl,pnameorder,**kwargs):
    #global pcollect
    pcollect=[];b1collect=[];b2collect=[];ncollect=[];mapcollect=[]
    maxlen=0
    for pname in pnameorder:
        if 'inclfilt' in kwargs:
            a,b1,b2,c,d=self.getparandbnds(PropAxesColl,pname,inclfilt=kwargs['inclfilt'])
        else:
            a,b1,b2,c,d=self.getparandbnds(PropAxesColl,pname,kwargs)
        if len(a) > maxlen:
            maxlen=len(a)
        if len(a) < maxlen:
            pcollect.append(np.array(a*maxlen))
            b1collect.append(np.array(b1*maxlen))
            b2collect.append(np.array(b2*maxlen))
            ncollect.append(c)
            mapcollect.append(d)
        else:
            pcollect.append(a)
            b1collect.append(b1)
            b2collect.append(b2)
            ncollect.append(c)
            mapcollect.append(d)
    # return pcollect
    # print pcollect, 'pcoll'
    # print pcollect[0]
    # print [[l for l in pcollect] for j,i in enumerate(pcollect[0])]
    pcollect=[flatten([[k for k in l[j]] for l in pcollect]) for j,i in enumerate(pcollect[0])]
    b1collect=[flatten([[k for k in l[j]] for l in b1collect]) for j,i in enumerate(b1collect[0])]
    b2collect=[flatten([[k for k in l[j]] for l in b2collect]) for j,i in enumerate(b2collect[0])]
    return pcollect,b1collect,b2collect,ncollect,mapcollect
def errfunctg3(parax,x,exp_data,m,dwbset,err,g,fl):
    """The main error function for the global relaxation dispersion data
    fitting process"""
    value=[]
    parnocoll=[]
    cestchi=[];rdchi=[]
    for l,i in enumerate(fl):
        a=[parax[j] for j in i]
        parnocoll.append([j for j in i])
        val=np.array((flatten([multifunctg2(a,x[l],m[l],dwbset[l],g[l])])))
        value.append(val)
        if g[l][0] > 10:
            cestchi.append((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))
        else:
            rdchi.append((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))
        # print np.average(((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))**2), g[l][0]
    paramno=len(set(flatten(parnocoll)))
    x1=((len(flatten(cestchi)))/(len(flatten(cestchi))+len(flatten(rdchi))))
    x2=(np.sqrt(2)*np.array(flatten(cestchi))*(1/np.sqrt(len(flatten(exp_data))-paramno)))
    x3=((len(flatten(rdchi)))/(len(flatten(cestchi))+len(flatten(rdchi))))
    x4=(np.sqrt(2)*np.array(flatten(rdchi))*(1/np.sqrt(len(flatten(exp_data))-paramno)))
    return np.array(flatten([x4,x2]))
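# Illustration of the residual scaling used in errfunctg3. This is a minimal
# sketch, not part of the fitting code: the helper names scaled_residuals and
# half_sum_of_squares are hypothetical. scipy.optimize.least_squares reports
# cost = 0.5 * sum(r**2), so multiplying the error-weighted residuals by
# sqrt(2)/sqrt(N - p) makes that cost equal to the reduced chi-square.

```python
import math


def scaled_residuals(observed, predicted, errors, n_params):
    """Error-weighted residuals multiplied by sqrt(2)/sqrt(N - p) (hypothetical helper)."""
    dof = len(observed) - n_params  # degrees of freedom; assumed positive
    scale = math.sqrt(2.0) / math.sqrt(dof)
    return [scale * (o - p) / e for o, p, e in zip(observed, predicted, errors)]


def half_sum_of_squares(residuals):
    """Cost in the convention of scipy.optimize.least_squares: 0.5 * sum(r**2)."""
    return 0.5 * sum(r * r for r in residuals)


# Toy data (made up for illustration): four points, two fitted parameters.
obs = [10.0, 12.0, 9.0, 11.0]
pred = [9.0, 12.5, 9.5, 10.0]
errs = [1.0, 0.5, 0.5, 1.0]

cost = half_sum_of_squares(scaled_residuals(obs, pred, errs, n_params=2))
reduced_chi2 = sum(((o - p) / e) ** 2 for o, p, e in zip(obs, pred, errs)) / (len(obs) - 2)
```

# With this scaling, cost and reduced_chi2 agree, which is why res.cost can be
# read directly as a reduced chi-square in the fitting log.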
def threetriangfunctionx(dwa, dwb, dwc, k12, k13, k23, pb, pc, w1, R1, R2):
    """Three-site R1rho-type exchange for CEST fits including R1 relaxation"""
    return np.array(
        [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, -R2 - k12 * pb / (-pc + 1) - k13 * pc / (-pb + 1), dwa, 0,
          k12 * (-pb - pc + 1) / (-pc + 1), 0, 0,
          k13 * (-pb - pc + 1) / (-pc + 1), 0, 0],
         [0, -dwa, -R2 - k12 * pb / (-pc + 1) - k13 * pc / (-pb + 1), w1,
          0, k12 * (-pb - pc + 1) / (-pc + 1), 0,
          0, k13 * (-pb - pc + 1) / (-pc + 1), 0],
         [2 * R1 * (1 - pb - pc), 0, -w1,
          -R1 - k12 * pb / (-pc + 1) - k13 * pc / (-pb + 1),
          0, 0, k12 * (-pb - pc + 1) / (-pc + 1),
          0, 0, k13 * (-pb - pc + 1) / (-pc + 1)],
         [0, k12 * pb / (-pc + 1), 0, 0,
          -R2 - k12 * (-pb - pc + 1) / (-pc + 1) - k23 * pc / (pb + pc), dwb, 0,
          k23 * pb / (pb + pc), 0, 0],
         [0, 0, k12 * pb / (-pc + 1), 0,
          -dwb, -R2 - k12 * (-pb - pc + 1) / (-pc + 1) - k23 * pc / (pb + pc), w1,
          0, k23 * pb / (pb + pc), 0],
         [2 * R1 * pb, 0, 0, k12 * pb / (-pc + 1),
          0, -w1, -R1 - k12 * (-pb - pc + 1) / (-pc + 1) - k23 * pc / (pb + pc),
          0, 0, k23 * pb / (pb + pc)],
         [0, k13 * pc / (-pb + 1), 0, 0,
          k23 * pc / (pb + pc), 0, 0,
          -R2 - k13 * (-pb - pc + 1) / (-pc + 1) - k23 * pb / (pb + pc), dwc, 0],
         [0, 0, k13 * pc / (-pb + 1), 0,
          0, k23 * pc / (pb + pc), 0,
          -dwc, -R2 - k13 * (-pb - pc + 1) / (-pc + 1) - k23 * pb / (pb + pc), w1],
         [2 * R1 * pc, 0, 0, k13 * pc / (-pb + 1),
          0, 0, k23 * pc / (pb + pc),
          0, -w1, -R1 - k13 * (-pb - pc + 1) / (-pc + 1) - k23 * pb / (pb + pc)]])
def approxdefs(**kwargs):
    """general function to calculate a number of expressions used to get
    approximations and exact solutions. This function frequently calls itself.
    Pre-calculated symbolic solutions are also generated from expressions
    contained herein. However, the function most likely directly used by the
    user will be nEVapprox.
    The parameter "calctype" determines which type of calculation is performed.
    """
    verb = 'quiet'
    #Parse arguments (parameters). Strings in the symbols list (sparlist)
    #will either take a value or become a symbol within the prm dictionary.
    #Strings in the other list (oparlist) will either take the value from
    #kwargs or be set to 0:
    sparlist=['pa','pb','pc','pd','t','tt','k12','k13','k14','k23','k24','k34'\
    ,'dwa','dwb','dwc','dwd','w1','deltao']
    oparlist = [
        'calctype', 'exporder', 'mode', 'pade', 'sc', 'nDV', 'nhmatnewsc'
    ]
    prm = {}
    for x in sparlist + oparlist:
        if x in kwargs:
            prm[x] = kwargs[x]
        elif x in sparlist:
            prm[x] = smp.Symbol(x)
        else:
            prm[x] = 0
    #We prefer variables (corresponding to values or symbols)
    #over bulky dictionary entries, which is why all dictionary entries are
    #assigned to corresponding variables:
    pa = prm['pa']
    pb = prm['pb']
    pc = prm['pc']
    pd = prm['pd']
    t = prm['t']
    tt = prm['tt']
    k12 = prm['k12']
    k13 = prm['k13']
    k14 = prm['k14']
    k23 = prm['k23']
    k24 = prm['k24']
    k34 = prm['k34']
    dwb = prm['dwb']
    dwc = prm['dwc']
    dwd = prm['dwd']
    calctype = prm['calctype']
    exporder = prm['exporder']
    mode = prm['mode']
    pade = prm['pade']
    sc = prm['sc']
    nDV = prm['nDV']
    dwa = prm['dwa']
    w1 = prm['w1']
    deltao = prm['deltao']
    #If verb is set to 'ose', then the calculation type is printed.
    if verb == 'ose':
        print 'calc ' + calctype + '... ',
    #calculation of the chemical shift matrix - including the exponential:
    #calculation of the chemical shift matrix - not including the
    #exponential. Similar to nAmatD. Only works in numerical mode.
elif calctype == 'nAmatDnonex': if mode == 2: return np.matrix([[0,0,0,0],[0,-1j*dwb,0,0],[0,0,-1j*dwc,0],\ [0,0,0,-1j*dwd]]) if anyisnotsymbol([k14,k24,k34],0) else \ (np.matrix([[0,0,0],[0,-1j*dwb,0],[0,0,-1j*dwc]]) if \ anyisnotsymbol([k23,k13],0) else np.matrix([[0,0],\ [0,-1j*dwb]])) elif calctype == 'nAmatP': if mode == 1: [k13,k23,k14,k24,k34]=[k13,0,0,0,k34] if sc == 0 else \ ([k13,0,k14,0,0] if sc == 1 else ([k13,0,k14,0,k34] if sc == 2 \ else ([k13,0,0,k24,k34] if sc == 3 else ([0,k23,0,0,0] if sc == 4 \ else ([k13,0,0,0,0] if sc == 5 else ([k13,k23,0,0,0] if sc == 6 \ else [0,0,0,0,0])))))) pl = [k12, k13, k23, k14, k24, k34, pb, pc, pd] if ((anyisnotsymbol([k14, k24, k34], 3) or anyissymbol([k14, k24, k34]))): kineticmat=[[-kv('12',pl)-kv('13',pl)-kv('14',pl),kv('21',pl),\ kv('31',pl),kv('41',pl)],[kv('12',pl),-kv('21',pl)-kv('23',pl)\ -kv('24',pl),kv('32',pl),kv('42',pl)],[kv('13',pl),kv('23',pl),\ -kv('31',pl)-kv('32',pl)-kv('34',pl),kv('43',pl)],[kv('14',pl),\ kv('24',pl),kv('34',pl),-kv('41',pl)-kv('42',pl)-kv('43',pl)]] elif ((anyisnotsymbol([k13, k23], 3) or anyissymbol([k13, k23]))): kineticmat=[[-kv('12',pl)-kv('13',pl),kv('21',pl),kv('31',pl)],\ [kv('12',pl),-kv('21',pl)-kv('23',pl),kv('32',pl)],[kv('13',pl),\ kv('23',pl),-kv('31',pl)-kv('32',pl)]] else: kineticmat = [[-kv('12', pl), kv('21', pl)], [kv('12', pl), -kv('21', pl)]] if mode == 2: return np.matrix(kineticmat) else: return smp.Matrix(kineticmat) ##approxdefs(calctype='r1rLKmat',mode=mode,k12=k12x,k13=k13x,k14=k14x,k24=k24x,k23=k23x,k34=k34x,pb=pbx,pc=pcx,pd=pdx,deltao=deltaox,w1=w1x,dwb=dwbx,dwc=dwcx,dwd=dwdx) #elif calctype == 'cpmg1': # return np.dot(np.dot(scplin.expm(approxdefs(t=t,dwb=dwb,calctype='nAmat',mode=2)*t),scplin.expm(approxdefs(t=t,dwb=dwb,calctype='nAstar')*t*2)),scplin.expm(approxdefs(t=t,dwb=dwb,calctype='nAmat')*t)) #calculates the exact solution for the exponential matrix term elif calctype == 'cpmg2': 
part1=approxdefs(calctype='nAmatP',mode=2,k12=k12,k13=k13,k14=k14,k24=\ k24,k23=k23,k34=k34,pb=pb,pc=pc,pd=pd) part2=approxdefs(calctype='nAmatDnonex',mode=2,k12=k12,k13=k13,k14=k14\ ,k24=k24,k23=k23,k34=k34,dwb=dwb,dwc=dwc,dwd=dwd) return np.dot(np.dot(scplin.expm((part1+part2)*t),\ scplin.expm((part1-part2)*2*t)),\ scplin.expm((part1+part2)*t)) #elif calctype == 'cpmg2old': # return np.dot(scplin.expm((approxdefs(calctype='nAmatP',mode=2)+approxdefs(calctype='nAmatDnonex',mode=2))*2*t),scplin.expm((approxdefs(calctype='nAmatP',mode=2)-approxdefs(calctype='nAmatDnonex',mode=2))*2*t)) #For calculations of Hmat elif calctype == 'nHmatpro': #calculations which are based on calculations of the #exact exponential matrix. (all ExpExact) if mode == 2 and (exporder == 0 or exporder == 10): cpmg2mat=approxdefs(mode=2,calctype='cpmg2',t=t,\ k12=k12,k13=k13,k14=k14,k24=k24,k23=k23,k34=k34,dwb=dwb\ ,dwc=dwc,dwd=dwd,pb=pb,pc=pc,pd=pd) if pade == 0: #exact logarithm return scplin.logm(cpmg2mat) / (t * 4) elif pade == 1: #Pade 1,0 return (-np.identity(np.shape(cpmg2mat)[0]) + cpmg2mat) / (t * 4) if pade == 2 or pade == 3: xmt=-approxdefs(mode=2,pade=1,calctype='nHmatpro',t=t,\ exporder=exporder,k12=k12,k13=k13,k14=k14,k24=k24,\ k23=k23,k34=k34,dwb=dwb,dwc=dwc,dwd=dwd,pb=pb,pc=pc,\ pd=pd)*(4*t) if pade == 2: #Pade 2,2 return -(np.dot(np.linalg.inv(6*np.identity(np.shape(xmt)\ [0])-6*xmt+np.dot(xmt,xmt)),(6*xmt-np.dot(3*xmt,xmt\ ))))/(4*t) elif pade == 3: #Pade 3,3 return -(np.dot(np.linalg.inv(60*np.identity(np.shape(xmt)[0])\ -90*xmt+np.dot(36*xmt,xmt)-np.dot(3*xmt,np.dot(xmt,\ xmt))),(60*xmt-np.dot(60*xmt,xmt)+np.dot(11*xmt,\ np.dot(xmt,xmt)))))/(4*t) elif calctype == 'r1rBigL2': # print w1, '!!!!' 
Lmat = [[[0, -x, 0], [x, 0, -w1], [0, w1, 0]] for x in [dwa, dwb, dwc, dwd]] idm = np.matrix(np.identity(3)) if mode == 2: dimen = 4 if anyisnotsymbol([k14, k24, k34],0) else (3 if \ anyisnotsymbol([k13,k23],0) else 2) else: dimen = 4 if sc < 4 else (3 if sc < 7 else 2) if dimen == 4: BigL=np.reshape(np.array([np.transpose(np.reshape(np.array(\ [Lmat[0],0*idm,0*idm,0*idm]),(12,3))),np.transpose(np.reshape(\ np.array([0*idm,Lmat[1],0*idm,0*idm]),(12,3))),np.transpose(\ np.reshape(np.array([0*idm,0*idm,Lmat[2],0*idm]),(12,3))),np.\ transpose(np.reshape(np.array([0*idm,0*idm,0*idm,Lmat[3]]),\ (12,3)))]),(12,12),order='A') elif dimen == 3: BigL=np.reshape(np.array([np.transpose(np.reshape(np.array(\ [Lmat[0],0*idm,0*idm]),(9,3))),np.transpose(np.reshape(\ np.array([0*idm,Lmat[1],0*idm]),(9,3))),np.transpose(\ np.reshape(np.array([0*idm,0*idm,Lmat[2]]),(9,3)))]),\ (9,9),order='A') else: BigL=np.reshape(np.array([np.transpose(np.reshape(np.array(\ [Lmat[0],0*idm]),(6,3))),np.transpose(np.reshape(\ np.array([0*idm,Lmat[1]]),(6,3)))]),\ (6,6),order='A') if mode == 2: return np.matrix(BigL) else: return smp.Matrix(BigL) elif calctype == 'nAmatP2': if mode == 1: [k13,k23,k14,k24,k34]=[k13,0,0,0,k34] if sc == 0 else \ ([k13,0,k14,0,0] if sc == 1 else ([k13,0,k14,0,k34] if sc == 2 \ else ([k13,0,0,k24,k34] if sc == 3 else ([0,k23,0,0,0] if sc == 4 \ else ([k13,0,0,0,0] if sc == 5 else ([k13,k23,0,0,0] if sc == 6 \ else [0,0,0,0,0])))))) pl = [k12, k13, k23, k14, k24, k34, pb, pc, pd] if ((anyisnotsymbol([k14, k24, k34], 3) or anyissymbol([k14, k24, k34]))): kineticmat=[[-kv('12',pl)-kv('13',pl)-kv('14',pl),kv('21',pl),\ kv('31',pl),kv('41',pl)],[kv('12',pl),-kv('21',pl)-kv('23',pl)\ -kv('24',pl),kv('32',pl),kv('42',pl)],[kv('13',pl),kv('23',pl),\ -kv('31',pl)-kv('32',pl)-kv('34',pl),kv('43',pl)],[kv('14',pl),\ kv('24',pl),kv('34',pl),-kv('41',pl)-kv('42',pl)-kv('43',pl)]] elif ((anyisnotsymbol([k13, k23], 3) or anyissymbol([k13, k23]))): 
kineticmat=[[-kv('12',pl)-kv('13',pl),kv('21',pl),kv('31',pl)],\ [kv('12',pl),-kv('21',pl)-kv('23',pl),kv('32',pl)],[kv('13',pl),\ kv('23',pl),-kv('31',pl)-kv('32',pl)]] else: kineticmat = [[-kv('12', pl), kv('21', pl)], [kv('12', pl), -kv('21', pl)]] if mode == 2: return np.matrix(kineticmat) else: return smp.Matrix(kineticmat) elif calctype == 'r1rBigK2': ktest=np.array(approxdefs(calctype='nAmatP2',mode=mode,k12=k12,k13=k13,k14=k14,k24=\ k24,k23=k23,k34=k34,pb=pb,pc=pc,pd=pd,sc=sc)) if mode == 1: ident = smp.eye(3) z = smp.zeros(np.shape(ktest)[0] * 3) else: ident = np.identity(3) z = np.zeros((np.shape(ktest)[0] * 3, np.shape(ktest)[0] * 3)) i = 0 for x in ktest: j = 0 for y in x: z[i:i + 3, j:j + 3] = y * ident j += 3 i += 3 if mode == 2: return np.matrix(z) else: return smp.Matrix(z) elif calctype == 'r1rLKmat2': return approxdefs(calctype='r1rBigK2',mode=mode,k12=k12,k13=k13,k14=k14\ ,k24=k24,k23=k23,k34=k34,pb=pb,pc=pc,pd=pd,sc=sc)+approxdefs(\ calctype='r1rBigL2',mode=mode,sc=sc,k12=k12,k13=k13,k14=k14,k24=\ k24,k23=k23,k34=k34,pb=pb,pc=pc,pd=pd,dwa=-(deltao-pb*dwb-pc*dwc-\ pd*dwd),dwb=dwb-(deltao-pb*dwb-pc*dwc-pd*dwd),dwc=dwc-(deltao-\ pb*dwb-pc*dwc-pd*dwd),dwd=dwd-(deltao-pb*dwb-pc*dwc-pd*dwd),w1=w1) elif calctype == 'sinsqth2': if sc < 4: return (w1**2 / (w1**2 + ( (1 - pb - pc - pd) * dwa + pb * dwb + pc * dwc + pd * dwd)**2)) elif sc < 7: return (w1**2 / (w1**2 + ((1 - pb - pc) * dwa + pb * dwb + pc * dwc)**2)) else: #print w1, pb, dwa, dwb return (w1**2 / (w1**2 + ((1 - pb) * dwa + pb * dwb)**2)) elif calctype == 'r1rex': LKM=approxdefs(calctype='r1rLKmat2',mode=mode,k12=k12,k13=k13,\ k14=k14,k24=k24,k23=k23,k34=k34,pb=pb,pc=pc,\ pd=pd,deltao=deltao,w1=w1,dwa=-(deltao-pb*dwb-pc*dwc-\ pd*dwd),dwb=dwb-(deltao-pb*dwb-pc*dwc-pd*dwd),dwc=dwc-(deltao-\ pb*dwb-pc*dwc-pd*dwd),dwd=dwd-(deltao-pb*dwb-pc*dwc-pd*dwd),sc=sc) sinsqt=approxdefs(calctype='sinsqth2',mode=mode,k12=k12,k13=k13,\ k14=k14,k24=k24,k23=k23,k34=k34,pb=pb,pc=pc,\ 
pd=pd,deltao=deltao,w1=w1,dwa=-(deltao-pb*dwb-pc*dwc-\ pd*dwd),dwb=dwb-(deltao-pb*dwb-pc*dwc-pd*dwd),dwc=dwc-(deltao-\ pb*dwb-pc*dwc-pd*dwd),dwd=dwd-(deltao-pb*dwb-pc*dwc-pd*dwd),sc=sc) if mode == 2: return np.real( 1 / np.max(-1 / np.linalg.eigvals(LKM))).item() / sinsqt
def cestfunction(omegarflist, deltaAB, deltaAC, k12, k13, k23, pb, pc, w1x, R1, R2, B0, exactx):
    """
    CEST functions to fit data. Input variables are mostly self-explanatory.
    exactx lets you choose between the exact solution and an approximation.
    trad is, at the moment, hard-coded.
    The position of the CEST dip, which is important and refers to
    "position 0", depends on whether the dominant site is in slow or fast
    exchange with any of the minor site(s). This position is calculated
    initially.
    """
    cest = []
    #R1=10
    trad = 0.4
    deltaA0 = 0
    w1 = w1x * (2 * np.pi)
    if k12 > deltaAB * B0 and k13 <= deltaAC * B0:
        deltaA0 = -deltaAB * pb / (1 - pc)
        deltaB0 = deltaAB - deltaAB * pb / (1 - pc)
        deltaC0 = deltaAC - deltaAB * pb / (1 - pc)
    elif k12 <= deltaAB * B0 and k13 <= deltaAC * B0:
        deltaA0 = 0
        deltaB0 = deltaAB
        deltaC0 = deltaAC
    elif k12 <= deltaAB * B0 and k13 > deltaAC * B0:
        deltaA0 = -deltaAC * pc / (1 - pb)
        deltaB0 = deltaAB - deltaAC * pc / (1 - pb)
        deltaC0 = deltaAC - deltaAC * pc / (1 - pb)
    elif k12 > deltaAB * B0 and k13 > deltaAC * B0:
        deltaA0 = -deltaAB * pb - deltaAC * pc
        deltaB0 = deltaAB - deltaAB * pb - deltaAC * pc
        deltaC0 = deltaAC - deltaAB * pb - deltaAC * pc
    for omegarf in omegarflist:
        dwa = (deltaA0 * B0 - (2 * np.pi) * omegarf * B0 * 500)
        dwb = (deltaB0 * B0 - (2 * np.pi) * omegarf * B0 * 500)
        dwc = (deltaC0 * B0 - (2 * np.pi) * omegarf * B0 * 500)
        omegaBar = (1 - pb - pc) * dwa + pb * dwb + pc * dwc
        we = np.sqrt(w1**2 + omegaBar**2)
        cos2t = (omegaBar / we)**2
        if exactx == 1:
            Z = threetriangfunctionx(dwa, dwb, dwc, k12, k13, k23, pb, pc, w1, R1, R2)
            at = expm(trad * Z)
            m0 = np.array([0.5, 0, 0, 1 - pb - pc, 0, 0, pb, 0, 0, pc])
            m1 = np.array([0.5, 0, 0, -(1 - pb - pc), 0, 0, -pb, 0, 0, -pc])
            magA = at[3, 3] * m0[3] + at[3, 6] * m0[6] + at[3, 9] * m0[9]
            magA = magA - (at[3, 3] * m1[3] + at[3, 6] * m1[6] + at[3, 9] * m1[9])
            magB = at[6, 3] * m0[3] + at[6, 6] * m0[6] + at[6, 9] * m0[9]
            magB = magB - (at[6, 3] * m1[3] + at[6, 6] * m1[6] + at[6, 9] * m1[9])
            magC = at[9, 3] * m0[3] + at[9, 6] * m0[6] + at[9, 9] * m0[9]
            magC = magC - (at[9, 3] * m1[3] + at[9, 6] * m1[6] + at[9, 9] * m1[9])
            mag = (magA + magB + magC) / 2
            # print omegarf,dwa,dwb,dwc,k12,k13,k23,pb,pc,w1,R1,R2, mag
            cest.append(mag)
    return cest[0] / (cos2t * np.exp(-R1 * trad))
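# The slow/fast-exchange branching at the top of cestfunction can be read as
# one population-weighted shift subtracted from all three sites. The sketch
# below restates that logic on its own; the function name
# cest_reference_shifts is hypothetical and the thresholds are the ones used
# above (k12 vs. deltaAB*B0, k13 vs. deltaAC*B0).

```python
def cest_reference_shifts(deltaAB, deltaAC, k12, k13, pb, pc, B0):
    """Return (deltaA0, deltaB0, deltaC0) relative to the observed major peak."""
    fast_ab = k12 > deltaAB * B0  # site B in fast exchange with A
    fast_ac = k13 > deltaAC * B0  # site C in fast exchange with A
    if fast_ab and fast_ac:
        shift = deltaAB * pb + deltaAC * pc
    elif fast_ab:
        shift = deltaAB * pb / (1 - pc)
    elif fast_ac:
        shift = deltaAC * pc / (1 - pb)
    else:
        shift = 0.0  # both minor sites in slow exchange: no averaging
    return -shift, deltaAB - shift, deltaAC - shift


# Toy values (made up): both minor sites in slow exchange.
slow = cest_reference_shifts(100.0, 200.0, 10.0, 10.0, 0.05, 0.02, 1.0)
# Toy values (made up): only site B in fast exchange.
fast_b = cest_reference_shifts(100.0, 200.0, 500.0, 10.0, 0.05, 0.02, 1.0)
```

# In the slow/slow case the major peak stays at position 0; in the other cases
# all three offsets move by the same averaged shift.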
def runfit4(praxs1,ctd,selresidues,precalc,resnam,conditions,path2020,savstatdir,files,filenamsav,paramsx,drawonly,cond): """ This function prepares the global fit and converts data structures where appropriate. input arguments: praxs1: property axis collections ctd not used precalc: triggers different preparatory routines depending on this switch 0: regular fit, no pre-calculated data (spinsystems object) containing pre-calcualted theoretical data for resampling """ moreconditions=[ctd,selresidues,precalc,resnam,files] if precalc != 0 and precalc != 1: spinsystems=hkio.loadss(savstatdir,precalc) reslalmall=selresidues[0]#[reslall[i] for i in pickthese] shuffletype=[['cpmg','dataset'],['Rex','dataset'],['cest','each']] spinsystems=reshuffle(spinsystems,reslalmall,shuffletype) elif precalc == 0: spinsystems,setlabels=prepro.launch(path2020,files) print spinsystems[35].name, spinsystems[35].datasets[0].rcpmg # print spinsystems[57].datasets[13].xlabel resultcolll=[]; poscolll=[] resnaml=[] allresultcoll=[] """This is a loop allowing to test various residue combination sets.""" for selectdatn in selresidues: """The experimental data are stored in the spinsystems object; the parameters are stored in the parameters object, using certain property axes. For the fitting engine, all data are flattened form. For each data point, there is a corresponding pointer set in a list of equal lengths which selects the appropriate parameters. All flattened lists, including pointer lists, are generated here. 
""" resnaml,timedat,rawdata,errd,field,field2,field3,tr,equationtype,poscoll,expcnd=prepro.passdatatofitn(spinsystems,selectdatn,precalc) """hard-coded filters, has to be modified for other experimental combinations""" setprotoncpmg=1 if setprotoncpmg == 1: filters=[[['residues','name'],[[rn] for rn in resnaml]],[['conc','value'],\ [['2.475'],['9.9']]],[['TR','name'],[['T'],['X']]],\ [['B1field','rounded'],[[500],[600],[800],[900]]],\ [['type','name'],[['cpmg'],['Rex'],['cest']]]] filters2=[[['conc','value'],[['2.475'],['9.9']]],\ [['TR','name'],[['T'],['X']]],\ [['B1field','rounded'],[[500],[600],[800],[900]]],\ [['type','name'],[['cpmg'],['Rex'],['cest']]]] else: filters=[[['residues','name'],[[rn] for rn in resnaml]],[['conc','value'],\ [['2.475'],['9.9']]],[['TR','name'],[['T'],['X']]],\ [['B1field','rounded'],[[50],[70],[80],[90]]],\ [['type','name'],[['cpmg'],['Rex'],['cest']]]] filters2=[[['conc','value'],[['2.475'],['9.9']]],\ [['TR','name'],[['T'],['X']]],\ [['B1field','rounded'],[[50],[70],[80],[90]]],\ [['type','name'],[['cpmg'],['Rex'],['cest']]]] selset=[] q=0 explist=[] for j,i in enumerate(expcnd): for l,k in enumerate(i): selsetx=[j] for n,m in enumerate(filters2): selsetx.append([p for p,o in enumerate(m[1]) if k[m[0][0]] in o][0]) selset.append(selsetx) explist.append(q) q+=1 filt=[] aa=selset bb=set(tuple(ix) for ix in selset) bb=[list(b) for b in aa] seldatasets=list(np.arange(len(aa))) inclfx=[] for l,i in enumerate(bb): inclf=[] for k,j in enumerate(i): inclf.append([filters[k][0][0],filters[k][0][1],filters[k][1][j]]) inclfx.append(inclf) a,b1,b2,c,e=paramsx.getallparandbnds(praxs1,['p','k','dw','R20500','R2mult'],inclfilt=inclf) f=flatten([np.array(e[m][:-1])+np.sum([k for k in [0]+[e[j][-1] for \ j,i in enumerate(e) if j < len(e)-1]][0:(l+1)]) for m,l in \ enumerate(np.arange(len(e)))]) filt.append(f) timedat=[flatten(timedat,levels=1)[i] for i in seldatasets] rawdata=[flatten(rawdata,levels=1)[i] for i in seldatasets] 
errd=[flatten(errd,levels=1)[i] for i in seldatasets] field=[flatten(field,levels=1)[i] for i in seldatasets] field2=[flatten(field2,levels=1)[i] for i in seldatasets] field3=[flatten(field3,levels=1)[i] for i in seldatasets] tr=[flatten(tr,levels=1)[i] for i in seldatasets] equationtype=[flatten(equationtype,levels=1)[i] for i in seldatasets] a,b1,b2,c,e=paramsx.getallparandbnds(praxs1,['p','k','dw','R20500','R2mult'],inclfilt=[]) f=flatten([np.array(e[m][:-1])+np.sum([k for k in [0]+[e[j][-1] for \ j,i in enumerate(e) if j < len(e)-1]][0:(l+1)]) for m,l \ in enumerate(np.arange(len(e)))]) poscolll.append(poscoll) if drawonly == 0: sojetzt,allrescoll=fitcpmg4(praxs1,timedat,rawdata,field,errd,\ precalc,equationtype,conditions,savstatdir,filenamsav,resnaml, poscolll,\ moreconditions,paramsx,filt) else: dwbsetp=[] setdwb=0 for i in np.arange(len(timedat)): for j in np.arange(len(flatten(timedat[i]))): dwbsetp.append(setdwb) setdwb+=1 par7=np.array(np.array(dwbsetp).astype('int')) fittedcurve,chsq0,chsq1,chsq2,chsq3=printrd(praxs1,timedat,\ rawdata,field,errd,equationtype,par7,paramsx,filt) for j,i in enumerate(seldatasets): print i, 'datasetno', chsq0[j], equationtype[j][0] if drawonly == 0: try: resultcolll.append(sojetzt) allresultcoll.append(allrescoll) except: print 'ugh1' else: for j in poscoll: for k,i in enumerate(seldatasets): #fittedcurve: spinsystems[j].datasets[i].fit=fittedcurve[k] if drawonly == 0: return resultcolll, resnaml, poscolll, spinsystems, allresultcoll else: return spinsystems#else:
def reshuffle(ss,reslalmall,shuffletype): """ Resampling for error calculation by adding or subtracting residuals from calculated experimental data. """ signrnd=1 heteroskedacity=1 withoutreplacement=0 allresid={} alldsref={} dplcoll=[] datatypes=[i[0] for i in shuffletype] for dt in datatypes: allresid[dt]=[] alldsref[dt]=[] dsnum=0 """calculating residuals""" for spinsyst in ss: selnam = spinsyst.name[0] if selnam in reslalmall: dpl=[] resid={} dsref={} if spinsyst.datasets[-1].setselect > dsnum: dsnum=spinsyst.datasets[-1].setselect for dt in datatypes: resid[dt]=[] dsref[dt]=[] # setparameters2=[dataname,'/home/hanskoss/data/Cadherin/nmrCad/procandcoll/TSnewsort/2020Feb/',[selnam],conditions,namresults] for ds in spinsyst.datasets: if signrnd != 1: signrnd=np.random.choice([-1,1],size=len(ds.fit)) if ds.datatype == 'cpmg': if heteroskedacity == 1: errlist=np.array([np.sqrt(np.average(np.array(xx)**2)) for xx in ds.rcpmgerr]) else: errlist=1 ds.resid=signrnd*(ds.rcpmg-ds.fit)/errlist elif ds.datatype == 'cest': if heteroskedacity == 1: errlist=(ds.ymax-ds.ymin)/2 else: errlist=1 ds.resid=signrnd*(ds.y-ds.fit)/errlist elif ds.datatype == 'Rex': if heteroskedacity == 1: errlist=[np.average([ds.yerr1,ds.yerr2]),np.average([ds.yerr1,ds.yerr2])] else: errlist=1 ds.resid=signrnd*(ds.yval-ds.fit)/np.array(errlist) # print ds.resid # print np.average(ds.resid), np.std(ds.resid) dsref[ds.datatype].append(ds.setselect) resid[ds.datatype].append(ds.resid) for dt in datatypes: allresid[dt].append(resid[dt]) alldsref[dt].append(dsref[dt]) dplcoll.append(dpl) dspool={} sspool={} """for each data point, a random residual from a set of eligible residuals is selected. This ways, the residuals for a certain group of data points are mixed (with replacement). 
method "dataset": include all points belonging to a certain datatype and dataset (across different residues) method "spinsyst": include all points belonging to a certain data type and residue (across different datasets) method "any": include all points belonging to a certain data type (any dataset and any spin system) method "each": include all points belonging to a certain dataset, data type and residue """ for dt,method in shuffletype: dspool[dt]=[[] for i in np.arange(dsnum+1)] sspool[dt]=[] for xx,x in enumerate(allresid[dt]): for z,y in enumerate(x): if 'y' != []: dspool[dt][alldsref[dt][xx][z]].append(y) sspool[dt].append(flatten(x)) dspool[dt]=[flatten(i) for i in dspool[dt]] for q,x in enumerate(allresid[dt]): for z,y in enumerate(x): if method == 'dataset': pool=dspool[dt][alldsref[dt][q][z]] elif method == 'spinsyst': pool=sspool[dt][q] elif method == 'any': pool=flatten(allresid[dt]) elif method == 'each': pool=allresid[dt][q][z] np.random.seed() if pool != []: if withoutreplacement == 0: """with replacement works always""" chosenwere=np.random.randint(len(pool),size=len(allresid[dt][q][z])) """without replacement is not implemented for the 'any' method at this time.""" else: chosenwere=np.random.choice(np.arange(len(pool)),size=len(allresid[dt][q][z]),replace=False) allresid[dt][q][z]=np.array([pool[i] for i in chosenwere]) for delthis in np.sort(chosenwere)[::-1]: try: del(pool[delthis]) except: np.delete(pool,delthis) #z=0 """calculate resampled data points, used as experimental data for error estimation""" for dt in datatypes: q=0 for ssn,spinsyst in enumerate(ss): selnam = spinsyst.name[0] if selnam in reslalmall: z=0 for dsn,ds in enumerate(spinsyst.datasets): if ds.datatype == dt: if heteroskedacity == 1: if ds.datatype == 'cpmg': errlist=np.array([np.sqrt(np.average(np.array(xx)**2)) for xx in ds.rcpmgerr]) elif ds.datatype == 'cest': errlist=(ds.ymax-ds.ymin)/2 elif ds.datatype == 'Rex': 
errlist=np.array([np.average([ds.yerr1,ds.yerr2]),np.average([ds.yerr1,ds.yerr2])]) else: errlist = 1 print ds.fit, allresid[dt][q][z], errlist, ds.fit+allresid[dt][q][z]*errlist, 'reshuffled' ss[ssn].datasets[dsn].reshufy=ds.fit+allresid[dt][q][z]*errlist if ds.datatype == 'Rex': ss[ssn].datasets[dsn].reshufy=[ss[ssn].datasets[dsn].reshufy[0] for i in np.arange(len(ss[ssn].datasets[dsn].reshufy))] z+=1 q+=1 return ss
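# Sketch of the resampling step performed by reshuffle, reduced to a single
# dataset and the "with replacement" case. The names below are hypothetical;
# the standard-library random module is used here instead of np.random so the
# example runs on its own. Residuals are assumed to be standardized already
# (divided by the per-point error), as done above when heteroskedacity == 1.

```python
import random


def resample_with_residuals(fit, residuals, errors, rng):
    """Build synthetic data: fit + randomly drawn standardized residual * error."""
    pool = list(residuals)
    # Draw one residual per data point, with replacement.
    drawn = [pool[rng.randrange(len(pool))] for _ in fit]
    return [f + r * e for f, r, e in zip(fit, drawn, errors)]


# Toy values (made up for illustration).
fit_curve = [1.0, 2.0, 3.0]
std_resid = [1.0, -2.0, 0.5]
point_err = [0.5, 0.5, 0.5]

synthetic = resample_with_residuals(fit_curve, std_resid, point_err, random.Random(0))
```

# Each synthetic point differs from the fitted curve by one of the pooled
# residuals rescaled with that point's own error.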
def fitcpmg4(praxs1,timedat,rawdata,field,err,mode,equationtype,conditions,savstatdir,filenamsav,resnam,poscoll,moreconditions,paramsx,fl):
    """ Fitting multiple types of relaxation dispersion data (function title
    somewhat misleading).
    input arguments:
    praxs1: property axis collection; timedat: x axis data; rawdata: \
    y axis data; field: B0 field relative to 500 MHz, err: y error, mode: not used here, \
    was used in the parent script as precalc; equationtype: type of relaxation dispersion, \
    see multifunctg2 function for details; conditions: list of conditions: pos 0 - reshuffle, \
    this is not used anymore, but not entirely deleted yet (check). pos 1 - details \
    about how many steps and attempts should be made for fitting. pos 1/0: number of initial \
    attempts of the "precalculation" round to get to a small chi square at the beginning \
    of the fitting procedure. A high number can be useful when starting in a Monte Carlo \
    manner at the beginning of the project (large boundaries, little idea about the system);
    pos 1/1 precalcatt: Number of documented major calculation steps during the precalculation; \
    pos 1/2 precalclen: Number of undocumented steps within each major precalculation step; \
    pos 1/3 maincalcatt: Number of documented major calculation steps during the main calculation;\
    pos 1/4 maincalclen: Number of undocumented steps within each major main calculation step.
    filenamsav: Name under which progress of the fit is saved to disk
    resnam: list of residue names included in fit (not used here)
    poscoll: (not used anymore, has to do with being able to undo some sorting action)
    moreconditions: package of conditions used by parent script, this is supposed to be saved with\
    the status and therefore an input argument.
    paramsx: collection of parameter objects, can have different bounds etc
    fl: list of pointer lists. Each list member points to certain parameter list positions, \
    there is one pointer list for each data point.
""" allconditions=[timedat,rawdata,field,err,mode,equationtype,conditions,\ filenamsav,resnam,poscoll,moreconditions] reshuffle=conditions[0] #0 or 1 numbattempts,precalcatt,precalclen,maincalcatt,maincalclen=conditions[1][0:5]#1, 5, 5, 5, 10 numdat=np.shape(rawdata)[0] dwbsetp=[] setdwb=0 for i in np.arange(len(timedat)): for j in np.arange(len(flatten(timedat[i]))): dwbsetp.append(setdwb) setdwb+=1 par6=np.array(field) gpar=np.array(equationtype) par7=np.array(np.array(dwbsetp).astype('int')) par2=np.array(timedat) par3=np.array(rawdata) errvalpar=np.array(err) # print len(boundsl) if reshuffle == 1: par7x=par7;par6x=par6;par2x=par2;par3x=par3;errvalparx=errvalpar;gparx=gpar par7=[];par6=[];par2=[];par3=[];errvalpar=[];gpar=[] for i,j in enumerate(par2x): if gparx[i] == 6: par2.append(par2x[i]) par3.append(par3x[i]) par6.append(par6x[i]) par7.append(par7x[i]) errvalpar.append(errvalparx[i]) gpar.append(gparx[i]) else: foundrnd=0 np.random.seed() while foundrnd == 0: np.random.seed() n=np.random.randint(len(par2x)) if gparx[n] == 3: par2.append(par2x[n]) par3.append(par3x[n]) par6.append(par6x[n]) par7.append(par7x[n]) errvalpar.append(errvalparx[n]) gpar.append(gparx[n]) foundrnd=1 allrescoll=[] par1coll=[] costcoll=[] for u in np.arange(numbattempts): np.random.seed() evalmode=1 a,b1,b2,c,e=paramsx.getallparandbnds(praxs1,['p','k','dw','R20500','R2mult'],inclfilt=[]) par1=a[u];boundsl=b1[u];boundsh=b2[u] if evalmode == 1: for i in zip(par1,boundsl,boundsh): if i[0] <= i[1] or i[0] >= i[2]: print 'problem!', i[0], i[1], i[2] if precalcatt > 0: try: for k in np.arange(precalcatt): if evalmode ==1: for i in zip(par1,boundsl,boundsh): print i[0],i[1],i[2] # print par1,par2, par3, par6, par7, errvalpar, gpar, fl # print 'here' # print gpar, 'gparold' gparn=[] for ggg in gpar: gparn.append(list([gggg+1000 for gggg in ggg])) #gpar=np.array([ggg+1000 for ggg in gpar]) # print gparn, 'gparnew' res=optimize.least_squares(errfunctg3,par1,max_nfev=precalclen,\ 
bounds=(boundsl,boundsh),args=(par2,par3,par6,par7,\ errvalpar,gparn,fl),method='trf',jac='3-point',x_scale='jac') #, par1=res.x allrescoll.append(res) print 'attempt ', u, ' precalculation step ', u, k, par1,res.cost,filenamsav, allrescoll[-1].cost hkio.savstatus2b(savstatdir,filenamsav,resnam,poscoll,allrescoll,allconditions) except: print 'well this one didnt work' par1coll.append(res.x) costcoll.append(res.cost) if precalcatt > 0: try: par1=par1coll[np.argmin(costcoll)] except: print "unfeasable result" # try: for k in np.arange(maincalcatt): print len(par1), len(par2), len(flatten(par2)), 'well' res=optimize.least_squares(errfunctg3,par1,max_nfev=maincalclen,\ bounds=(boundsl,boundsh),args=(par2,par3,par6,par7,errvalpar,gpar,\ fl),method='trf',jac='3-point',x_scale='jac') allrescoll.append(res) hkio.savstatus2b(savstatdir,filenamsav,resnam,poscoll,allrescoll,allconditions) par1=res.x print 'mainalculation step ', u, k, par1, res.cost,filenamsav, allrescoll[-1].cost for i in res.x: print i print 'final cost', res.cost # except: # res=[0] # print 'late fitting error' return res, allrescoll #xtol=1e-9
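# Sketch of the multi-start idea behind the precalculation round in fitcpmg4:
# several starting points are evaluated and the one with the lowest cost seeds
# the main calculation (par1=par1coll[np.argmin(costcoll)]). This is a
# simplified stand-in, not the routine above: starting points are drawn
# uniformly within the bounds and scored directly, whereas fitcpmg4 takes them
# from getallparandbnds and runs a short least_squares on each. All names here
# are hypothetical.

```python
import random


def best_start(cost_fn, lower, upper, attempts, rng):
    """Evaluate random starting points within bounds; return the cheapest one."""
    best_params, best_cost = None, float("inf")
    for _ in range(attempts):
        trial = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        cost = cost_fn(trial)
        if cost < best_cost:
            best_params, best_cost = trial, cost
    return best_params, best_cost


def toy_cost(p):
    """Made-up quadratic cost with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


start, start_cost = best_start(toy_cost, [-5.0, -5.0], [5.0, 5.0], 50, random.Random(1))
```

# A larger number of attempts helps most when the bounds are wide and little
# is known about the system, as noted in the docstring for conditions pos 1/0.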
def multifunctg2(params,x,m,dwbset,g):
    """performs fitting for a variety of relaxation dispersion experiments.
    Because r20 of a higher field >= r20 of a lower field, r20 is constructed by
    combining 1-2 parameters: R20 @ 500, and a scaling factor >= 1 for R20 @
    B0 > 500.
    input variables: params (parameters); x: x values (1/(4*tau_cp) or offset);
    m=magnetic field B0 relative to 500 MHz, to be multiplied with delta_omegas
    (dwx, dcx)
    delta_omega refers to the chemical shift of the minor state from the major
    state in s-1 at 500 MHz.
    dwbset: not used anymore.
    g: type of experiment. 3: CPMG; 6: Rex. Can choose whether to calculate via
    pseudo-CPMG data or r1rho/free precession; 5: R1rho (not used); > 10: CEST
    (g is then equal to the B1 field in Hz).
    Hard-coded: R1 (0.966 or 1.144, selected by field; was 1.8 in the test
    phase; R1 barely changes CEST curves); duration of CEST experiment: 400 ms.
    Possible Improvements: might reconsider hard-coded R1 at revision stage.
    might test more extensively whether it is better to use a pseudo-CPMG model
    or a r1rho/free model to calculate Rex.
    """
    pbx,pcx,kexx,k13x,k23x=params[0:5]
    try:
        r20=params[7]*params[8]
    except:
        r20=params[7]
    result=[]
    g=g[0]
    if g > 1000:
        calcchoice = 0
        g=g-1000
    else:
        calcchoice = 1
    if g == 3:
        tx=1/(4*np.array(x)); #tau_cp
        for b,a in enumerate(tx):
            dwx=m[b]*params[5]
            dcx=m[b]*params[6]
            """ for step 5 and 6: exact calculation """
            result.append(hkRDmath.nEVapprox(a,dwx,pbx,kexx,2,0,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)+r20)
            """ for steps 1-4: Exp0Log2Lamda2 approximation
            from Koss, H., Rance, M., & Palmer, A. G., 3rd. (2018). General \
            Expressions for Carr-Purcell-Meiboom-Gill Relaxation Dispersion \
            for N-Site Chemical Exchange. Biochemistry, 57(31), 4753-4763."""
            #result.append(hkRDmath.nEVapprox(a,dwx,pbx,kexx,2,0,2,2,dcx,pcx,k13x,k23x,0,0,0,0,0)+r20)
    elif g == 6: #
        tx=1/(np.array(x)) #tx is exact duration of tau in tau-pi-tau-tau-pi-tau
        #print x,tx, 'tx2'
        wx=np.array(x)*(np.sqrt(3))/(2*np.pi)
        for b,a in enumerate(tx):
            dwx=m[b]*params[5]
            dcx=m[b]*params[6]
            """Two distinct ways to calculate 15N-Rex: The low-power data point
            is always calculated as CPMG-like. The high-power point is
            calculated either as originating from a CPMG-like experiment, or
            from an R1rho-like experiment. The R1rho-like calculation is a
            little bit more accurate but less stable (eigenvalue calculation
            can give extreme results).
            For Step 1-3, the CPMG-like calculation is preferred, then R1rho-
            like """
            # rex1old=hkRDmath.r1req(0,dwx,pbx,kexx,2,4*wx[1]*np.pi*2,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)
            rex1=hkRDmath.r1req(0,dwx,pbx,kexx,2,wx[1]*np.pi*2,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)
            cpmg1=hkRDmath.nEVapprox(tx[1],dwx,pbx,kexx,2,0,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)
            cpmg2=hkRDmath.nEVapprox(tx[0],dwx,pbx,kexx,2,0,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)
            # viarexold=cpmg2-rex1old
            viarex=cpmg2-rex1
            viacpmg=cpmg2-cpmg1#hkRDmath.nEVapprox(tx[0],dwx,pbx,kexx,2,0,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)-hkRDmath.nEVapprox(tx[1],dwx,pbx,kexx,2,0,0,0,dcx,pcx,k13x,k23x,0,0,0,0,0)
            if calcchoice == 1:
                result.append(viarex)
            else:
                result.append(viacpmg)
    elif g > 10:
        tx=np.array(x)
        dwx=params[5]
        dcx=params[6]
        for b,a in enumerate(tx):
            """R1 has a very small influence on the CEST calculation. We have
            simulated an average R1 for the C11 monomer at 700 and 800 MHz
            which is selected here. """
            if m[b] < 0.15:
                r1set=0.966
            else:
                r1set=1.144 #was 1.8 in test phase
            datapointx=hkRDmath.cestfunction([a],dwx,dcx,kexx,k13x,k23x,pbx,pcx,g,r1set,r20,m[b],1)
            result.append(datapointx)
    return result
def printrd(praxs1,x,exp_data,m,err,g,dwbset,par,fl):
    """This function recalculates the theoretical value and chi square from
    parameters and experimental data."""
    value=[];valuealt=[]
    chicoll=[]
    chicoll2=[]
    a,b1,b2,c,e=par.getallparandbnds(praxs1,['p','k','dw','R20500','R2mult'],inclfilt=[])
    par1=a[0]
    parnocoll=[]
    for l,i in enumerate(fl):
        thisset=[j for j in i]
        parnocoll.append(thisset)
    paramno=len(set(flatten(parnocoll)))
    cestchi=[];rdchi=[]
    xalt=[list(np.arange(np.min(xi),np.max(xi),20)) for xi in x]
    for l,i in enumerate(fl):
        a=[par1[j] for j in i]
        thisset=[j for j in i]
        val=np.array((flatten([multifunctg2(a,x[l],m[l],dwbset[l],g[l])])))
        """ comment in the following line if a higher resolution CPMG curve
        is required"""
        if fineres == 1:
            #print len([m[l][0] for xa in xalt]), len([g[l][0] for xa in xalt]), len(xalt[l])
            valalt=np.array((flatten([multifunctg2(a,xalt[l],[m[l][0] for xa in xalt[l]],dwbset[l],[g[l][0] for xa in xalt[l]])])))
            valuealt.append(valalt)
        value.append(val)
        if g[l][0] > 10:
            cestchi.append((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))
        else:
            rdchi.append((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))
        chicoll2.append((np.array(exp_data[l])-np.array(val))/(np.array(err[l])))
        chicoll.append(np.sum(((np.array(exp_data[l])-np.array(val))/\
            (np.array(err[l])*np.sqrt(len(exp_data[l])-paramno*\
            (len(exp_data[l])/len(flatten(exp_data))))))**2))
    paramno=len(set(flatten(parnocoll)))
    err0=np.array(flatten(exp_data))-np.array(flatten(value))
    x1=((len(flatten(cestchi)))/(len(flatten(cestchi))+len(flatten(rdchi))))
    x2=np.sum((np.sqrt(2)*np.array(flatten(cestchi))*(1/np.sqrt(len(flatten(exp_data))-paramno)))**2)/2
    x3=((len(flatten(rdchi)))/(len(flatten(cestchi))+len(flatten(rdchi))))
    x4=np.sum((np.sqrt(2)*np.array(flatten(rdchi))*(1/np.sqrt(len(flatten(exp_data))-paramno)))**2)/2
    if fineres == 1:
        value=valuealt
    #print value, 'value'
    return value, chicoll,chicoll2, (x4*x1+x3*x2)*2,(np.array(flatten(err0))/\
        (np.array(flatten(err))*np.sqrt(len(flatten(exp_data))-paramno)))