Description
- Operating System version: n/a
- Firebase SDK version: firebase-admin==5.0.2
- Firebase Product: auth, user management
- Python version: 3.8
- Pip version: n/a
[REQUIRED] Step 3: Describe the problem
When fetching all user accounts using the firebase_admin.auth.list_users().iterate_all() helper, performance is very poor, taking about 5 seconds to fetch 1000 users.
I have a project which has several million users. There is a scheduled task that fetches all the users for processing, but the task takes way too long and runs into problems with App Engine request timeouts.
Steps to reproduce:
Here is an App Engine standard + Python 3.8 project which has 2 routes to show the difference between iterating through all users using the iterate all helper, versus fetching each page directly:
https://gist.github.com/davidwtbuxton/c6bbdc1e93f686cae931d1e38b5bd555
The code logs a message for every 1000 users. In my testing it takes about 5 seconds for every 1000 users on the slow path (using iterate_all) and about 1 second for every 1000 users on the fast path (not using iterate_all).
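For readers who don't want to open the gist, the fast path can be sketched roughly as below. This is only an illustration and may differ from the gist: `iter_users_by_page` is a made-up helper name, and it assumes the page object exposes `users` and `get_next_page()` (returning `None` after the last page), as `ListUsersPage` does.

```python
def iter_users_by_page(first_page):
    """Yield users one page at a time, reading each page's users list once.

    `first_page` is assumed to behave like firebase_admin's ListUsersPage:
    a `users` attribute and a `get_next_page()` method that returns None
    when there are no more pages.
    """
    page = first_page
    while page is not None:
        users = page.users  # read the (possibly computed) property once per page
        for user in users:
            yield user
        page = page.get_next_page()
```

With the Admin SDK this would presumably be used as `for user in iter_users_by_page(auth.list_users()): ...` in place of `auth.list_users().iterate_all()`.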
Relevant Code:
I've got a WIP patch here: davidwtbuxton@20991aa
I need to test that WIP patch some more, but would this be useful? Suggestions for improving the patch?
I think the cause of the poor performance is the PageIterator class and how next() is implemented. In the next() method, the code evaluates self.items 3 times. For the _UserIterator class, items is actually a computed property that accesses ListUsersPage.users, which is itself a computed property that builds a new list of user records on every access.
So for each user in a page, the iterator ends up constructing the list of all users in the current page 3 times, just to access 1 user. When a page has 1000 users, the list gets built about 3000 times per page (1000 x 3) instead of the ideal case of building it just once per page, and each build itself creates 1000 user records.
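To make the effect concrete, here is a small standalone simulation. It does not use the SDK's actual code: `FakePage`, `NaiveIterator`, `CachingIterator` and `count_builds` are hypothetical stand-ins that only mimic the access pattern described above, with a single page and no network calls.

```python
class FakePage:
    """Stand-in for a page whose `users` property rebuilds its list on each access."""

    def __init__(self, raw_ids):
        self._raw_ids = raw_ids
        self.builds = 0  # how many times the users list was constructed

    @property
    def users(self):
        self.builds += 1
        return [{"uid": uid} for uid in self._raw_ids]


class NaiveIterator:
    """Evaluates the computed `items` property three times per returned item."""

    def __init__(self, page):
        self._page = page
        self._index = 0

    @property
    def items(self):
        return self._page.users

    def __iter__(self):
        return self

    def __next__(self):
        if self._index == len(self.items):  # access 1
            pass  # a real iterator would fetch the next page here
        if self._index < len(self.items):  # access 2
            item = self.items[self._index]  # access 3
            self._index += 1
            return item
        raise StopIteration


class CachingIterator:
    """Reads the page's users list once and iterates over the cached copy."""

    def __init__(self, page):
        self._items = page.users  # built a single time
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index < len(self._items):
            item = self._items[self._index]
            self._index += 1
            return item
        raise StopIteration


def count_builds(iterator_cls, n):
    """Iterate over a fake page of n users and report how often the list was built."""
    page = FakePage(range(n))
    for _ in iterator_cls(page):
        pass
    return page.builds


if __name__ == "__main__":
    print("naive:", count_builds(NaiveIterator, 1000))
    print("cached:", count_builds(CachingIterator, 1000))
```

In this toy version the naive iterator rebuilds the list roughly three times per user, while the caching one builds it once per page. Caching the list per page is one possible fix; the WIP patch may take a different approach.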