AWS Lambda Python: Forcing the Lambda Sandbox to Timeout

Brian Olson
Jan 22, 2022

I recently dug into a service that wasn’t performing well and at its root found a quirk of the Lambda platform. We were experiencing unexpected timeouts and delays in our Lambda function executions.

For background, the Lambda function has a significant init cost: initializing dependencies, loading a fairly large code package, and connecting to an ENI for VPC access.

We use Lambda Provisioned Concurrency to mitigate the impact of our long inits. Typically this works fine, but every once in a while we’d see Lambda functions with very long executions.

Let’s set up an experiment with a simple Python Lambda function:

import time
import datetime

# Module-level state: survives between invocations in the same sandbox
runCount = 0

def lambda_handler(event, context):
    global runCount
    current_dt = datetime.datetime.now()
    print("Start of function: ", current_dt)
    if runCount == 0:
        current_dt = datetime.datetime.now()
        print("I need to init: ", current_dt)
    elif runCount < 3:
        print("I don't need to init, just running: ", runCount)
    else:
        # Sleep well past the 3 second function timeout
        time.sleep(15)
    runCount = runCount + 1
    current_dt = datetime.datetime.now()
    print("end of function: ", current_dt)
    return

This function will “init” once (meaning it will print init), run normally twice, and then sleep for 15 seconds.

The Lambda function has a timeout of 3 seconds, so the 15 second sleep on the fourth execution will cause a timeout error.

The question is: if the Lambda function times out, does the Lambda sandbox restart and clear the static context, in this case the runCount variable? Let’s try it!

On the first execution we get:

Start of function:  2021-12-17 17:07:20.445407
I need to init: 2021-12-17 17:07:20.445504
end of function: 2021-12-17 17:07:20.445517

On the second and third executions we get:

Start of function:  2021-12-17 17:07:52.900683
I don't need to init, just running: 1
end of function: 2021-12-17 17:07:52.900751
...
Start of function: 2021-12-17 17:08:07.818484
I don't need to init, just running: 2
end of function: 2021-12-17 17:08:07.818551

And on the 4th execution we get:

"errorMessage": "2021-12-17T17:08:18.171Z 5c81669f-1fc2-45ca-b11d-0cdfe225ac72 Task timed out after 3.00 seconds"

And on the 5th execution we get:

Start of function:  2021-12-17 17:08:40.696788
I need to init: 2021-12-17 17:08:40.696861
end of function: 2021-12-17 17:08:40.696873

Which means we’ve cleared the global runCount variable, which is really interesting! Let’s try the same thing with a slightly different function, one that raises an exception instead of sleeping:

import time
import datetime

# Module-level state: survives between invocations in the same sandbox
runCount = 0

def lambda_handler(event, context):
    global runCount
    current_dt = datetime.datetime.now()
    print("Start of function: ", current_dt)
    if runCount == 0:
        current_dt = datetime.datetime.now()
        print("I need to init: ", current_dt)
    elif runCount < 2:
        print("I don't need to init, just running: ", runCount)
    else:
        # Raised before the increment, so runCount stays at 2 from here on
        raise Exception("Too many runs!")
    runCount = runCount + 1
    current_dt = datetime.datetime.now()
    print("end of function: ", current_dt)
    return

The first couple of executions look normal:

Start of function:  2021-12-17 17:13:31.610999
I need to init: 2021-12-17 17:13:31.611058
end of function: 2021-12-17 17:13:31.611070
...
Start of function: 2021-12-17 17:14:42.119543
I don't need to init, just running: 1
end of function: 2021-12-17 17:14:42.119606

But after that, all of our executions look like this:

Start of function:  2021-12-17 17:15:04.392925
[ERROR] Exception: Too many runs!
...
Start of function: 2021-12-17 17:15:24.882019
[ERROR] Exception: Too many runs!

So timeouts kill the Lambda sandbox, but exceptions don’t! Fascinating!
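To watch for this behavior in a real function, one option is to tag every invocation with an identifier created at module load. The following is a minimal sketch, not part of the experiment above: the names SANDBOX_ID, INIT_TIME and invocation_count, and the shape of the returned dictionary, are illustrative assumptions. If the identifier changes between two invocations, the module was re-initialized in between.

```python
import datetime
import uuid

# Module-level code runs once per sandbox init, so these values persist across
# warm invocations and are recreated whenever the sandbox is replaced.
# All names here are hypothetical, chosen for illustration only.
SANDBOX_ID = str(uuid.uuid4())
INIT_TIME = datetime.datetime.now(datetime.timezone.utc)
invocation_count = 0


def lambda_handler(event, context):
    global invocation_count
    invocation_count += 1
    # True only for the first invocation served by this sandbox
    first_invocation = invocation_count == 1
    print("sandbox:", SANDBOX_ID,
          "invocation:", invocation_count,
          "first in sandbox:", first_invocation)
    return {
        "sandbox_id": SANDBOX_ID,
        "init_time": INIT_TIME.isoformat(),
        "invocation_count": invocation_count,
        "first_invocation": first_invocation,
    }
```

Searching the logs for the sandbox identifier just before and just after a timeout should show whether the same pattern holds for your function.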

This is really good to be aware of. If your Lambda function has an expensive init, a timeout costs more than the failed request: the sandbox is thrown away and the next invocation has to init again. Since a slow request does not necessarily mean the function is completely dead, you may want to consider extending the AWS Lambda function timeout to let longer requests run.
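The timeout can be raised in the console or through the API. Here is a minimal sketch built around boto3’s update_function_configuration call. The helper name extend_timeout, the function name my-function and the 30 second value are placeholders, not values from the service described here. The client is passed in as an argument so the helper can be exercised without AWS credentials.

```python
# Lambda's documented upper limit for a function timeout (15 minutes)
MAX_LAMBDA_TIMEOUT_SECONDS = 900


def extend_timeout(lambda_client, function_name, timeout_seconds):
    """Set a function's timeout. lambda_client is assumed to be a boto3 Lambda client."""
    if not 1 <= timeout_seconds <= MAX_LAMBDA_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 and {MAX_LAMBDA_TIMEOUT_SECONDS} seconds"
        )
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )


# Example usage (needs the boto3 package and AWS credentials):
#   import boto3
#   extend_timeout(boto3.client("lambda"), "my-function", 30)
```

How far to raise the timeout is a judgment call: it should cover the slow requests you actually want to finish, without letting hung invocations run for minutes.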

Brian Olson

Engineer, formerly at Amazon, currently at Google. All opinions are my own. Consider supporting here: https://devblabs.medium.com/membership