I have some code that calculates how well a model fits a user's details. It is fairly long, so I have cut everything that already performs well and kept only the bottleneck and the code around it. My goal here is speed!
A few notes on the variables:
Calculated at runtime (changes for each user):
 TopSecHead – list of lists of section headings – sublists have different lengths
 allSecScoreDicP1 – dict of float lists – always the same length
 allSecScoreDicP2 – dict of float lists – always the same length
 allSecReducerDicP1 – dict of float lists – always the same length
 allSecReducerDicP2 – dict of float lists – always the same length
The last four all have the same dimensions as each other.
Calculated once and saved/loaded (does not change per user):
 docSecSizesFull – list of lists of float lists – sublists have different lengths
 shortSecSizesFull – list of float lists – sublists have different lengths
 cutPointsFull – list of lists of float lists – sublists have different lengths
 tmplNumFull – list of lists of float lists – sublists have different lengths
 AllDocSplitsFull – list of float lists – sublists have different lengths
The above five all have the same dimensions as each other.
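To make the shapes concrete, here is a purely illustrative toy setup (all names, sizes, and values are invented; the real structures are much larger). Note that the code below compares cutPoints[z] directly to a loop index, so I model it as one number per template here:

```python
# Per-user structures: dicts sharing the same keys and the same list lengths.
allSecScoreDicP1 = {'BasicInfo': [0.9, 0.5], 'Skills': [0.7, 0.3]}
allSecReducerDicP1 = {'BasicInfo': [1.0, 0.9], 'Skills': [0.95, 0.8]}

# Loaded-once structures: outer index is x-4 (one entry per section count),
# inner index is the template; sublist lengths vary between templates.
shortSecSizesFull = [
    [[0, 1], [1, 0, 1]],   # x == 4: two templates
    [[1], [0, 0]],         # x == 5: two templates
]
cutPointsFull = [
    [1, 2],                # x == 4: per-template cut-over point
    [0, 1],                # x == 5
]
```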
for x in range(4, 8):
    docSecSizes = docSecSizesFull[x-4]
    shortSecSizes = shortSecSizesFull[x-4]
    cutPoints = cutPointsFull[x-4]
    tmpltNum = tmplNumFull[x-4]
    layoutNums = 0
    numTemps = len(docSecSizes)
    tmpsplits = [AllDocSplitsFull[x-4][z] for z in range(numTemps)]
    alltmplIds = [tmplNumFull[x-4][z] for z in range(numTemps)]
    for y in itertools.permutations(TopSecHead[x-4][1:]):
        tmpHeadSec = ['BasicInfo']
        headingIDs = [str(0)]
        for z in y:
            tmpHeadSec.append(z)
            headingIDs.append(str(headingLookups.index(z)))
        SectionIDs = ','.join(headingIDs)
        tmpvals = []
        tmpArray = []
        for key in allSecScoreDicP1:
            tmpArray.append(allSecScoreDicP1[key])
        nparr = np.array(tmpArray)
        print(nparr.transpose())
        for z in range(numTemps):
            docScore = 0
            docScoreReducer = 1
            for q in range(len(shortSecSizes[z])):
                indexVal = shortSecSizes[z][q]
                if q < cutPoints[z]:
                    docScore += allSecScoreDicP1[tmpHeadSec[q]][indexVal]
                    docScoreReducer *= allSecReducerDicP1[tmpHeadSec[q]][indexVal]
                else:
                    docScore += allSecScoreDicP2[tmpHeadSec[q]][indexVal]
                    docScoreReducer *= allSecReducerDicP2[tmpHeadSec[q]][indexVal]
            docScore = docScore * docScoreReducer
            tmpvals.append(docScore)
        numTemplate = len(tmpvals)
        totaldocs += numTemplate
        sectionNum = [x] * numTemplate
        layoutNumIterable = [layoutNums] * numTemplate
        SectionIDsIterable = [SectionIDs] * numTemplate
        scoredTemplates.append(pd.DataFrame(list(zip(sectionNum, alltmplIds, layoutNumIterable, tmpvals, SectionIDsIterable, tmpsplits)), columns=['#Sections', 'TemplateID', 'LayoutID', 'Score', 'SectionIDs', 'Splits']))
        layoutNums += 1
allScoredTemplates = pd.concat(scoredTemplates, ignore_index=True)
The problem code is this bit:
for z in range(numTemps):
    docScore = 0
    docScoreReducer = 1
    for q in range(len(shortSecSizes[z])):
        indexVal = shortSecSizes[z][q]
        if q < cutPoints[z]:
            docScore += allSecScoreDicP1[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP1[tmpHeadSec[q]][indexVal]
        else:
            docScore += allSecScoreDicP2[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP2[tmpHeadSec[q]][indexVal]
    docScore = docScore * docScoreReducer
    tmpvals.append(docScore)
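For anyone who wants to run this bit in isolation, here is a self-contained toy version of the scoring loop (all headings, sizes, and values are invented stand-ins for the real data):

```python
# Invented miniature inputs: two headings, two templates.
allSecScoreDicP1 = {'BasicInfo': [0.9, 0.5], 'Skills': [0.7, 0.3]}
allSecScoreDicP2 = {'BasicInfo': [0.8, 0.4], 'Skills': [0.6, 0.2]}
allSecReducerDicP1 = {'BasicInfo': [1.0, 0.9], 'Skills': [0.95, 0.8]}
allSecReducerDicP2 = {'BasicInfo': [0.9, 0.85], 'Skills': [0.9, 0.7]}
tmpHeadSec = ['BasicInfo', 'Skills']
shortSecSizes = [[0, 1], [1, 0]]   # shortSecSizes[z][q]: index into the dict lists
cutPoints = [1, 2]                 # slots before this point use the P1 dicts
numTemps = len(shortSecSizes)

tmpvals = []
for z in range(numTemps):
    docScore = 0.0
    docScoreReducer = 1.0
    for q in range(len(shortSecSizes[z])):
        indexVal = shortSecSizes[z][q]
        if q < cutPoints[z]:
            docScore += allSecScoreDicP1[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP1[tmpHeadSec[q]][indexVal]
        else:
            docScore += allSecScoreDicP2[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP2[tmpHeadSec[q]][indexVal]
    tmpvals.append(docScore * docScoreReducer)
# tmpvals ≈ [0.77, 1.026]
```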
I tried changing it to list comprehensions, but it was slower:
docScore = [sum(allSecScoreDicP1[tmpHeadSec[q]][shortSecSizes[z][q]] if q < cutPoints[z] else allSecScoreDicP2[tmpHeadSec[q]][shortSecSizes[z][q]] for q in range(len(shortSecSizes[z]))) for z in range(numTemps)]
docReducer = [np.prod([allSecReducerDicP1[tmpHeadSec[q]][shortSecSizes[z][q]] if q < cutPoints[z] else allSecReducerDicP2[tmpHeadSec[q]][shortSecSizes[z][q]] for q in range(len(shortSecSizes[z]))]) for z in range(numTemps)]
tmpvals = [docScore[x] * docReducer[x] for x in range(len(docScore))]
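As a sanity check, the comprehension version does compute the same values as the loop on a minimal invented example. The pick_score/pick_reducer helpers below are hypothetical, introduced only to keep the comprehensions readable:

```python
import numpy as np

# Invented data: two headings, two templates.
allSecScoreDicP1 = {'A': [1.0, 2.0], 'B': [3.0, 4.0]}
allSecScoreDicP2 = {'A': [10.0, 20.0], 'B': [30.0, 40.0]}
allSecReducerDicP1 = {'A': [0.5, 1.0], 'B': [0.25, 0.5]}
allSecReducerDicP2 = {'A': [0.1, 0.2], 'B': [0.3, 0.4]}
tmpHeadSec = ['A', 'B']
shortSecSizes = [[0, 1], [1, 1]]
cutPoints = [1, 2]
numTemps = len(shortSecSizes)

def pick_score(z, q):
    # Choose the P1 or P2 score dict depending on the cut-over point.
    d = allSecScoreDicP1 if q < cutPoints[z] else allSecScoreDicP2
    return d[tmpHeadSec[q]][shortSecSizes[z][q]]

def pick_reducer(z, q):
    d = allSecReducerDicP1 if q < cutPoints[z] else allSecReducerDicP2
    return d[tmpHeadSec[q]][shortSecSizes[z][q]]

# Lists (not generators) so they can be indexed afterwards.
docScore = [sum(pick_score(z, q) for q in range(len(shortSecSizes[z])))
            for z in range(numTemps)]
docReducer = [np.prod([pick_reducer(z, q) for q in range(len(shortSecSizes[z]))])
              for z in range(numTemps)]
tmpvals = [docScore[z] * docReducer[z] for z in range(numTemps)]
```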
Any suggestions on optimization methods would be greatly appreciated. I also tried converting the code to Cython; it compiled and worked, but it was about 10 times slower!