How to access sapi_request_info during profiling? #403

Open
lighthuter opened this issue Jan 8, 2024 · 12 comments

Comments

@lighthuter

lighthuter commented Jan 8, 2024

I want to access the request query params during profiling to implement trigger functionality similar to Xdebug's. I'm trying to read sapi_request_info.query_string to do this. I tried to create new "intptr_t" and "char" types and combine them with Pointer::indexedAt(), but I get random characters as a result.

Example

$zend_type_reader = $this->getTypeReader($php_version);
[$offset, $size] = $zend_type_reader->getOffsetAndSizeOfMember(
    'sapi_request_info',
    'query_string'
);

$pointer = new Pointer(
    RawPointer::class, // intptr_t
    $sg_address + $offset,
    $size // 8
);

$charPtr = $dereferencer->deref($pointer)->value;

$pointer = new Pointer(
    RawPointer::class,
    $sg_address + $offset,
    $size // 8
);

$charPtr = $dereferencer->deref($pointer)->value;

$pointer = new Pointer(
    Char::class, // char
    (int) $charPtr,
    $size // 4
);

$firstChar = $dereferencer->deref($pointer)->value; // some random character

Is it possible to do something like this?

@sj-i
Member

sj-i commented Jan 9, 2024

@lighthuter
Hi!

In that code you first dereference sapi_request_info.query_string as a pointer to another pointer, and then dereference the retrieved pointer as a pointer to Char. But sapi_request_info.query_string is actually char *, not char **.

From what I see, you don't need the RawPointer in this case. You should get the correct value by dereferencing sapi_request_info.query_string as a pointer to a 1-byte Char. Dereferencing $pointer->indexedAt(1) will then give you the next byte, as long as the string has not yet been terminated by a NUL character.

@lighthuter
Author

lighthuter commented Jan 9, 2024

Hey @sj-i!

Thanks for your answer! I tried to do what you explained, but it seems I misunderstood your explanation. Could you please point out what I'm doing wrong here?

      $zend_type_reader = $this->getTypeReader($php_version);
      [$offset, $size] = $zend_type_reader->getOffsetAndSizeOfMember(
          'sapi_request_info',
          'query_string'
      );

      $pointer = new Pointer(
          Char::class, // char
          $sg_address + $offset,
          $size // 8
      );

      $charArray = [];
      $index = 1; // start from 1

      while (true) {
          $pointer = $pointer->indexedAt($index++);
          $charValue = $dereferencer->deref($pointer)->value;

          if ($charValue === "\x00") {
              break;
          }

          $charArray[] = $charValue;
      }

      $string = implode($charArray); // b"°ëÀ"

Char.php

final class Char implements Dereferencable
{
    public $value;

    /** @param CastedCData<CInteger> $casted_cdata */
    public function __construct(
        private CastedCData $casted_cdata,
        private Pointer $pointer,
    ) {
        $this->value = $this->casted_cdata->casted->cdata;
    }

    public static function getCTypeName(): string
    {
        return 'char';
    }

    public static function fromCastedCData(
        CastedCData $casted_cdata,
        Pointer $pointer
    ): static {
        /** @var CastedCData<CInteger> $casted_cdata */
        return new self($casted_cdata, $pointer);
    }

    public function getPointer(): Pointer
    {
        return $this->pointer;
    }
}

As a result I get something like b"°ëÀ", which cannot be translated to meaningful text AFAIK.

@sj-i
Member

sj-i commented Jan 10, 2024

@lighthuter
Hi!

Oh, I overlooked a number of things. You were right to read the pointer value as a 64-bit value first.

It looks like you are currently using the address of SG as if it were the address of SG(request_info).

The address of the query_string field is $sg_address + $request_info_offset + $query_string_offset, like this.

[$request_info_offset, ] = $zend_type_reader->getOffsetAndSizeOfMember(
    'sapi_globals_struct',
    'request_info'
);

[$query_string_offset, ] = $zend_type_reader->getOffsetAndSizeOfMember(
    'sapi_request_info',
    'query_string'
);

$query_string_pointer_pointer = new Pointer(
    RawInt64::class,
    $sg_address + $request_info_offset + $query_string_offset,
    8
);

$query_string_pointer = new Pointer(
    Char::class,
    $dereferencer->deref($query_string_pointer_pointer)->value,
    1,
);

$index = 1; // start from 1

This must start from 0.

$pointer = $pointer->indexedAt($index++);

You should not overwrite $pointer. Otherwise each iteration indexes from the already-moved pointer instead of from the start of the string, so the address drifts further with every loop.
Something like this should work.

$pointer = $query_string_pointer->indexedAt($index++);

I haven't been able to check it properly due to machine trouble, but if it doesn't work, please let me know again!
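
To make the address arithmetic concrete, here is a minimal sketch that does not use Reli at all. FakeMemory, readPointer and readCString are hypothetical helpers, and the addresses and offsets are made-up numbers; a 64-bit little-endian layout is assumed. It only imitates the two steps described above: read the 8-byte pointer stored at SG + request_info offset + query_string offset, then read bytes from that address starting at index 0 until the NUL terminator, always indexing from the same base address.

```php
<?php
// Hypothetical stand-in for the target process memory; not part of Reli.
final class FakeMemory
{
    /** @param array<int, string> $regions start address => raw bytes */
    public function __construct(private array $regions)
    {
    }

    public function read(int $address, int $size): string
    {
        foreach ($this->regions as $start => $bytes) {
            $offset = $address - $start;
            if ($offset >= 0 && $offset + $size <= strlen($bytes)) {
                return substr($bytes, $offset, $size);
            }
        }
        throw new RuntimeException(sprintf('failed to read memory at 0x%x', $address));
    }
}

// Step 1: the field holds a pointer, so read it as a 64-bit value.
function readPointer(FakeMemory $memory, int $address): int
{
    return unpack('P', $memory->read($address, 8))[1]; // 'P' = 64-bit little-endian
}

// Step 2: read 1-byte chars from the pointed-to address until NUL.
function readCString(FakeMemory $memory, int $base_address, int $max_length = 4096): string
{
    $result = '';
    for ($index = 0; $index < $max_length; $index++) { // starts from 0
        // Always index from $base_address; the base itself is never overwritten.
        $byte = $memory->read($base_address + $index, 1);
        if ($byte === "\x00") {
            break;
        }
        $result .= $byte;
    }
    return $result;
}

// Made-up layout: request_info sits 8 bytes into SG, query_string 16 bytes into request_info.
$sg_address = 0x1000;
$request_info_offset = 8;
$query_string_offset = 16;
$string_address = 0x2000;

$memory = new FakeMemory([
    $sg_address => str_repeat("\xAA", $request_info_offset + $query_string_offset)
        . pack('P', $string_address),
    $string_address => "trigger=1&foo=bar\x00garbage",
]);

$field_address = $sg_address + $request_info_offset + $query_string_offset;
$query_string = readCString($memory, readPointer($memory, $field_address));
echo $query_string, PHP_EOL; // trigger=1&foo=bar
```

Leaving out $request_info_offset in this sketch would make readPointer return the 0xAA filler bytes instead of a valid address, which is the same kind of garbage seen in the earlier attempts.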

@lighthuter
Author

Thank you a lot, it worked!

@lighthuter
Author

@sj-i I have another question if you don't mind :) Could you please give a suggestion on how to access $_SERVER values? Basically I'm interested in request URL, but since request_uri does not contain it I figured I could try to get it from $_SERVER. Thank you.

@sj-i
Member

sj-i commented Jan 16, 2024

@lighthuter
Hi! Sorry for the delay. I was extremely busy.

You can read the superglobals via EG(symbol_table).
So first get EG, then read the global variables table using ZendArray::findByKey.

The code would be like this.

$server_zval = $eg->symbol_table->findByKey($dereferencer, '_SERVER')?->val->value->arr;
if (!is_null($server_zval)) {
    $server_array = $dereferencer->deref($server_zval);
    $request_uri_pointer = $server_array->findByKey($dereferencer, 'REQUEST_URI')?->val->value->str;
    if (!is_null($request_uri_pointer)) {
        $request_uri_zend_string = $dereferencer->deref($request_uri_pointer);
        $request_uri = $request_uri_zend_string->toString($dereferencer, $request_uri_zend_string->len);
    }
}

BTW, you can read $_SERVER['QUERY_STRING'] in the same way. I'm not sure which is faster though.
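
Once the query string is available as a plain PHP string, whichever way it was read, a trigger check in the spirit of the original question could be sketched as below. hasTrigger and the parameter name RELI_TRIGGER are hypothetical and chosen only for illustration; nothing like them is confirmed to exist in Reli.

```php
<?php
// Hypothetical helper: reports whether a query string carries a trigger parameter.
// The default name RELI_TRIGGER is an assumption, not an existing Reli option.
function hasTrigger(string $query_string, string $name = 'RELI_TRIGGER'): bool
{
    // parse_str() decodes the query string into an array, like PHP does for $_GET.
    parse_str($query_string, $params);
    return array_key_exists($name, $params);
}

var_dump(hasTrigger('page=2&RELI_TRIGGER=1')); // bool(true)
var_dump(hasTrigger('page=2'));                // bool(false)
```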

@lighthuter
Author

Hi, @sj-i ! No worries, I really appreciate your help! I tried to use the code you provided but unfortunately, I get an error
#message: "failed to read memory. target_pid=5370, remote_address=0x7fe694bcfe00, errno=14" when trying to deref server_array ($server_array = $dereferencer->deref($server_zval);).

@sj-i
Member

sj-i commented Jan 18, 2024

@lighthuter
Oh, my bad.

$eg->symbol_table->findByKey($dereferencer, '_SERVER')?->val->value->arr;

This code pulls a Bucket from the hash table, but since the Bucket is only used as a temporary object, it is freed immediately along with its internal \FFI\CData object. So the reference to the zval causes a use-after-free here...

We have to keep the reference to the Bucket until we pull its value.

$server_bucket = $eg->symbol_table->findByKey($dereferencer, '_SERVER');
if (!is_null($server_bucket)) {
    $server_array_pointer = $server_bucket->val->value->arr;
    $server_array = $dereferencer->deref($server_array_pointer);
    $request_uri_bucket = $server_array->findByKey($dereferencer, 'REQUEST_URI');
    if (!is_null($request_uri_bucket)) {
        $request_uri_pointer = $request_uri_bucket->val->value->str;
        $request_uri_zend_string = $dereferencer->deref($request_uri_pointer);
        $request_uri = $request_uri_zend_string->toString($dereferencer, $request_uri_zend_string->len);
    }
}
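
The lifetime problem described above can be imitated in plain PHP without FFI. This is only an analogy: FakeBucket and FakeValue are hypothetical classes, and the destructor stands in for the owned C memory being released together with its owner. It shows why chaining a property access onto a temporary loses the owner, while binding the owner to a variable first keeps the value usable.

```php
<?php
// Hypothetical classes that only mimic the ownership relation;
// they are not Reli's Bucket or PHP's FFI\CData.
final class FakeValue
{
    public bool $valid = true;
}

final class FakeBucket
{
    public FakeValue $val;

    public function __construct()
    {
        $this->val = new FakeValue();
    }

    public function __destruct()
    {
        // Stands in for the backing memory being freed with its owner.
        $this->val->valid = false;
    }
}

function findBucket(): ?FakeBucket
{
    return new FakeBucket();
}

// Temporary owner: destroyed as soon as the property has been fetched.
$dangling = findBucket()?->val;
var_dump($dangling->valid); // bool(false)

// Named owner: stays alive for as long as $bucket is referenced.
$bucket = findBucket();
$kept = $bucket->val;
var_dump($kept->valid); // bool(true)
```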

Also, I found a bug in the implementation of ZendArray::findByKey and just fixed it in #413.
Please pull the latest 0.11.x or 0.12.x, and try the above code.

Be aware that $_SERVER is one of the auto globals in PHP. The variable isn't there until the target process first accesses $_SERVER somewhere in a script during each request.

@lighthuter
Author

@sj-i, thanks, this works! I have a question though. While debugging my changes, I found that there are a ton of errors like "failed to read memory. target_pid=6034, remote_address=0xd1, errno=14" when profiling using i:daemon. They come from the following lines in CallTraceReader:

  • 159 $class_name = $current_function->getClassName($cached_dereferencer) ?? '';
  • 137 $current_execute_data = $dereferencer->deref($current_execute_data->prev_execute_data);
  • 143 $function_name = $current_execute_data->getFunctionName(

Is this expected?

@sj-i
Member

sj-i commented Jan 20, 2024

@lighthuter Hi!

If you use Reli without stopping the target process via the -S option, this is normal behavior.

The state of the target process continues to change while Reli is reading its memory, which can naturally lead to inconsistencies in the memory contents being read.

Reli follows in the footsteps of phpspy, which in turn follows rbspy. The same behavior is the default in both phpspy and older versions of rbspy.

But honestly, I rarely use Reli without the -S option these days. There is a tool called py-spy, another follower of rbspy. Both py-spy and newer versions of rbspy stop target processes by default. So I may change Reli's default to stop the target in the near future.

@lighthuter
Author

@sj-i I see, thanks for the explanation! I tried with the -S option though, and the errors are still there in the same amount. Am I doing something wrong?

@sj-i
Member

sj-i commented Jan 22, 2024

@lighthuter Hmm, then there should not be so many. Could you give me simple reproduction steps? If so, please open another issue as a bug report.
