Long running PHP PubSub process that eats all cpu exactly 1h into processing #2338

@udAL


This is more of a plea for help than a bug report.

  • OS: Dockerized Ubuntu 16.04.6 (Modified gcr.io/google-appengine/php72)
  • PHP version: 7.2.22
  • Package name and version:
    google/cloud-pubsub 1.11.0
    google/cloud-core 1.27.0
    (outdated because of compatibility reasons)

We have been using Pub/Sub for a while to manage background tasks. We have a GCE VM running supervisord with two long-running PHP processes that poll for new messages and process them. It looks something like this:

use Google\Cloud\PubSub\PubSubClient;
$pubSub = new PubSubClient([
    'keyFilePath' => 'includes/secret.json'
]);
$subscription = $pubSub->subscription('workers');
while(true)
{
    $message = false;
    try
    {
        foreach ($subscription->pull([
            'maxMessages' => 1
        ]) as $pullMessage)
        {
            $message = $pullMessage;
        }
    }
    catch( Exception $e)
    {
        // Nothing
    }
    if($message)
    {
        // Do stuff
        $subscription->acknowledge($message);
    }
}
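
One thing worth noting about the loop above: if pull() throws immediately, the empty catch sends the loop straight back into the next pull with no pause, which can keep a core busy even when there is nothing to process. Whether that explains the CPU usage reported here is not confirmed. The following is only a minimal sketch of a poll loop that logs failures and backs off between them. The names backoffSeconds and runWorker are hypothetical, and the pull, handle and sleep steps are passed in as callables so the sketch does not depend on the Pub/Sub client.

```php
<?php
declare(strict_types=1);

/**
 * Seconds to wait before the next pull after $failures consecutive errors.
 * Doubles on each failure and is capped at $maxSeconds.
 * Hypothetical helper, not part of google/cloud-pubsub.
 */
function backoffSeconds(int $failures, int $maxSeconds = 60): int
{
    if ($failures <= 0) {
        return 0;
    }
    // Cap the exponent so the shift cannot overflow.
    return min($maxSeconds, 1 << min($failures - 1, 30));
}

/**
 * Poll loop. $pull returns one message or null and may throw;
 * $handle processes and acknowledges a message; $sleep receives seconds.
 * $maxIterations exists only so the loop can be bounded in tests.
 */
function runWorker(
    callable $pull,
    callable $handle,
    callable $sleep,
    int $maxIterations = PHP_INT_MAX
): void {
    $failures = 0;
    for ($i = 0; $i < $maxIterations; $i++) {
        try {
            $message = $pull();
            $failures = 0; // a successful pull resets the backoff
        } catch (\Throwable $e) {
            $failures++;
            error_log('pull failed: ' . $e->getMessage());
            $sleep(backoffSeconds($failures));
            continue;
        }
        if ($message !== null) {
            $handle($message);
        }
    }
}
```

With the real client, $pull would wrap $subscription->pull(['maxMessages' => 1]) and return the first message or null, $handle would do the work and call $subscription->acknowledge($message), and $sleep could simply be 'sleep'.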

As discussed in #1986, ever since we started we have seen this error pop up every hour:

[19-Sep-2019 16:30:30 Europe/Madrid] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: cURL error 35: gnutls_handshake()$
Stack trace:
#0 /app/vendor/google/cloud-core/src/RequestWrapper.php(189): Google\Cloud\Core\RequestWrapper->convertToGoogleException(Object(GuzzleHttp\Exce$
#1 /app/vendor/google/cloud-core/src/RestTrait.php(95): Google\Cloud\Core\RequestWrapper->send(Object(GuzzleHttp\Psr7\Request), Array)
#2 /app/vendor/google/cloud-pubsub/src/Connection/Rest.php(193): Google\Cloud\PubSub\Connection\Rest->send('subscriptions', 'pull', Array)
#3 /app/vendor/google/cloud-pubsub/src/Subscription.php(409): Google\Cloud\PubSub\Connection\Rest->pull(Array)
#4 /app/worker.php(77): Google\Cloud\PubSub\Subscription->pull(Array)
#5 {main}
  thrown in /app/vendor/google/cloud-core/src/RequestWrapper.php on line 336

It wasn't a big deal for us, since supervisord restarts the processes. That changed a few days ago when we updated our Docker image (which we do frequently). Suddenly our worker VMs were constantly at 100% CPU, even though there was nothing to process in Pub/Sub and no task running. We can restart the processes and they work fine for exactly one hour, then they jump to 100% CPU for no apparent reason. We're stuck with an essential server that needs restarting every 60 minutes.

We have done a lot of debugging, and the only difference we found between the old and new machines is that the Ubuntu package libcurl3-gnutls was updated from 7.47.0-1ubuntu2.12 to 7.47.0-1ubuntu2.14. We haven't found a way to downgrade.
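
When comparing machines like this, it can help to confirm which libcurl build and TLS backend PHP itself is linked against. Below is a minimal sketch using PHP's built-in curl_version(), which returns an array that includes the 'version' and 'ssl_version' keys. The helper name describeCurlBuild is hypothetical. Note that 'version' is the upstream libcurl version, so it will not show the Ubuntu package revision (the ubuntu2.12 versus ubuntu2.14 part).

```php
<?php
declare(strict_types=1);

/**
 * Summarises a libcurl build from an array shaped like the result of
 * curl_version(). Hypothetical helper, not part of any library.
 */
function describeCurlBuild(array $info): string
{
    $version = $info['version'] ?? 'unknown';
    $tls = $info['ssl_version'] ?? 'unknown';
    return sprintf('libcurl %s, TLS backend %s', $version, $tls);
}

// Only call curl_version() when the curl extension is loaded.
if (function_exists('curl_version')) {
    echo describeCurlBuild(curl_version()), PHP_EOL;
}
```

Running this on both the old and the new image would show whether the TLS backend string differs between them.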

We have tried to force Guzzle to use a specific TLS version. The Pub/Sub client lets you pass options through to it like this:

$subscription->pull([
    'maxMessages' => 1,
    'restOptions' => [
        'curl' => [
            CURLOPT_SSLVERSION => CURL_SSLVERSION_TLSv1_0
        ]
    ]
])

But it hasn't worked, not with CURL_SSLVERSION_TLSv1, CURL_SSLVERSION_TLSv1_0, or CURL_SSLVERSION_TLSv1_2.
I know this is a complex error that probably has nothing to do with google/cloud-pubsub, and that this is an outdated version. Maybe it's a TLS incompatibility with the Google Pub/Sub service. Maybe it's an outdated library on an updated OS. I really don't know, but we'll take any suggestion you can give us. Any ideas?

Tomorrow I'll try a clean project with updated packages and post the results.

Metadata

Labels

  • api: pubsub (Issues related to the Pub/Sub API.)
  • external (This issue is blocked on a bug with the actual product.)
  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
