TPTP feature: 196713

Author: Stanislav Polevic

Email: stanislav.v.polevic@intel.com

Last updated:

 


Rough workload estimate in person weeks:

| Process | Sizing | Names of people doing the work |
| --- | --- | --- |
| Design | 2 | n/a |
| Code | 5 | |
| Test | 2 | |
| Documentation | 1 | |
| Build and infrastructure | n/a | n/a |
| Code review & other committer work (e.g. check-in, contribution tracking) if this is to be contributed by someone who is not a committer in the component | n/a | N/A - will be done by committer |
| Total | 10 | |

 

Requirement summary

The performance of the TPTP profiler needs to improve. It is important to provide a scalable, performant profiler with a limited memory and processing footprint. This enhancement enables the profiler to output and transfer its data to the client in a non-XML format.

Extension points

To add a new message, an exploiter should:

User interactions

The new format must be introduced in a way that stays compatible with previous versions, so both the XML and the binary format need to be supported.

User interface

Add a new option, 'Binary Transfer Format', which is unchecked by default. If the user checks it, the profiler is invoked with the command line parameter 'format=binary'.

Command line option 'format' values:

Also, to ensure differentiation between XML and binary trace files upon import, binary trace files should have a .trcbin extension.

Compatibility

The new binary format will not replace the current XML format. It will be an optional feature invoked only upon user request.

To ensure backward compatibility, the client should inform the profiler via the command channel if it can accept a binary stream. The profiler should transfer XML data if no such notification was received, even when 'format=binary' is used.
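The fallback rule above can be sketched as a small decision function. This is only an illustration: the class, enum, and method names are hypothetical, and how the client notification actually arrives over the command channel is not specified here.

```java
/** Hypothetical helper that applies the XML fallback rule described above. */
final class FormatNegotiation {
    enum TraceFormat { XML, BINARY }

    /**
     * @param formatOption        value of the 'format' command line option, or null if absent
     * @param clientAcceptsBinary true only if the client announced binary support
     *                            over the command channel (assumed to be known by now)
     */
    static TraceFormat select(String formatOption, boolean clientAcceptsBinary) {
        boolean binaryRequested = "binary".equals(formatOption);
        // Binary is used only when it was both requested and announced as supported.
        return (binaryRequested && clientAcceptsBinary) ? TraceFormat.BINARY : TraceFormat.XML;
    }
}
```

With this rule, an older client that never sends the notification keeps receiving XML even if the profiler was started with 'format=binary'.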

Also, some kind of negotiation protocol might be introduced.

Design summary

The current XML format has 45 events described in XML syntax. Each event has a corresponding XML tag with a set of attributes.

The proposed format will have 45 customized event messages, each with a unique message ID and a complete set of attribute values in binary form. Attributes are written one after another in a predefined order, and unused attributes must be filled with their default value (null).

Binary stream format

A binary stream consists of messages with a predefined structure. Messages are classified as either system or data messages.

ID ranges

| ID range | Description |
| --- | --- |
| 0-1000 | System IDs |
| 1001-32767 | Data IDs |

Binary file header

The data stream descriptor marks the beginning of a new binary stream and describes basic features of the stream, such as endianness and format version.

The length of the header is fixed and will not change, which ensures backward compatibility. All other information will be sent via system messages.

Within a major version, the format must keep a stable structure.

| Name | Length | Values |
| --- | --- | --- |
| Magic identifier | 4 bytes | 0TBF (Trace Binary Format) |
| Current format version | 2 bytes | 1st byte: major, 2nd byte: minor |
| Platform | 1 byte | 0: 32-bit, 1: 64-bit |
| Endianness | 1 byte | 0: big-endian, 1: little-endian |
| Offset to data | 4 bytes | Offset to real data |
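A minimal sketch of how a client might read this fixed 12-byte header (4 + 2 + 1 + 1 + 4 bytes). The class name is hypothetical, and two points are assumptions rather than part of the description: that the magic value is the four ASCII characters "0TBF", and that the offset field is an unsigned value stored in the byte order announced by the endianness byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for the binary file header described above. */
final class TraceHeader {
    static final int LENGTH = 12; // 4 + 2 + 1 + 1 + 4 bytes

    final int major;
    final int minor;
    final boolean is64Bit;
    final ByteOrder order;
    final long dataOffset;

    private TraceHeader(int major, int minor, boolean is64Bit, ByteOrder order, long dataOffset) {
        this.major = major;
        this.minor = minor;
        this.is64Bit = is64Bit;
        this.order = order;
        this.dataOffset = dataOffset;
    }

    static TraceHeader parse(byte[] bytes) {
        if (bytes.length < LENGTH) {
            throw new IllegalArgumentException("header too short");
        }
        // Assumption: the magic value is plain ASCII text.
        String magic = new String(bytes, 0, 4, StandardCharsets.US_ASCII);
        if (!"0TBF".equals(magic)) {
            throw new IllegalArgumentException("not a binary trace stream");
        }
        int major = bytes[4] & 0xFF;
        int minor = bytes[5] & 0xFF;
        if (bytes[6] != 0 && bytes[6] != 1) {
            throw new IllegalArgumentException("unknown platform value");
        }
        if (bytes[7] != 0 && bytes[7] != 1) {
            throw new IllegalArgumentException("unknown endianness value");
        }
        ByteOrder order = bytes[7] == 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        // Assumption: the offset uses the byte order declared one byte earlier.
        long offset = ByteBuffer.wrap(bytes, 8, 4).order(order).getInt() & 0xFFFFFFFFL;
        return new TraceHeader(major, minor, bytes[6] == 1, order, offset);
    }
}
```

Because the endianness byte precedes the only multi-byte field, the header can be decoded in a single pass without guessing the byte order.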

Character encoding message

This message specifies the character encoding used in the stream.

| Name | Length | Values |
| --- | --- | --- |
| ID | 2 bytes | 1 |
| Encoding name | Variable | Encoding name string, such as UTF8 or KOI8-R |
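On the Java client, the encoding name could be mapped to a `java.nio.charset.Charset`. The sketch below is an illustration under assumptions: the class name is hypothetical, the name is assumed to be an ASCII string terminated by \0, and falling back to UTF-8 for unknown names is a choice made here, not something the format prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical decoder for the body of the character encoding message (ID 1). */
final class EncodingMessage {
    static Charset parse(byte[] body) {
        // Assumption: the encoding name ends at the first zero byte (or at the end of the body).
        int length = 0;
        while (length < body.length && body[length] != 0) {
            length++;
        }
        String name = new String(body, 0, length, StandardCharsets.US_ASCII);
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names; the fallback is an assumption.
            return StandardCharsets.UTF_8;
        }
    }
}
```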

CPU max qualified frequency

To reduce the overhead on the profiler side, timestamps can be sent as raw ticks. The client, given the frequency, can compute the timestamps on its own.

| Name | Length | Values |
| --- | --- | --- |
| ID | 2 bytes | 2 |
| Frequency | 8 bytes | From CPUID: Frequency * Multiplier |
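The client-side conversion from ticks to time could look like the sketch below. The class name is hypothetical, and the frequency is assumed to be expressed in ticks per second (Hz); the unit is not stated above.

```java
/** Hypothetical tick-to-time converter based on the frequency system message (ID 2). */
final class TickClock {
    private final long frequencyHz;

    TickClock(long frequencyHz) {
        if (frequencyHz <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        this.frequencyHz = frequencyHz;
    }

    /** Converts a non-negative tick count to nanoseconds, truncating fractions of a nanosecond. */
    long toNanos(long ticks) {
        long seconds = ticks / frequencyHz;
        long remainder = ticks % frequencyHz;
        // remainder < frequencyHz, so the product fits in a long for
        // frequencies below roughly 9.2 GHz.
        return seconds * 1_000_000_000L + remainder * 1_000_000_000L / frequencyHz;
    }
}
```

Splitting the value into whole seconds and a remainder keeps the arithmetic in integers, so no precision is lost to floating point rounding.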

Binary message format

To allow efficient parsing on the client side, each message has a header consisting of a unique message ID and the message length.

| Entry | Size | Notes |
| --- | --- | --- |
| Message ID | 2 bytes | Unique ID of the message |
| Message size | 4 bytes | Message length in bytes |
| Message attributes | variable-length | Array of message attribute values in predefined order |
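The length field lets a client split the stream into messages without understanding every message type. The following sketch shows one way to do that. The names are hypothetical, and it assumes that the message size counts the whole message including the 6-byte header; the table above does not say whether the header is included.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical splitter that cuts a binary stream into framed messages. */
final class MessageFramer {
    static final int HEADER_LENGTH = 6; // 2-byte ID + 4-byte size

    static final class Frame {
        final int id;
        final ByteBuffer attributes;

        Frame(int id, ByteBuffer attributes) {
            this.id = id;
            this.attributes = attributes;
        }

        /** IDs 0-1000 are system messages; 1001-32767 are data messages. */
        boolean isSystem() {
            return id <= 1000;
        }
    }

    static List<Frame> split(byte[] stream, ByteOrder order) {
        ByteBuffer in = ByteBuffer.wrap(stream).order(order);
        List<Frame> frames = new ArrayList<>();
        while (in.remaining() >= HEADER_LENGTH) {
            int start = in.position();
            int id = in.getShort();
            int size = in.getInt(); // assumption: includes the 6 header bytes
            if (id < 0 || size < HEADER_LENGTH || size - HEADER_LENGTH > in.remaining()) {
                throw new IllegalArgumentException("corrupt message at offset " + start);
            }
            // View of the attribute bytes only, with the same byte order.
            ByteBuffer body = in.slice().order(order);
            body.limit(size - HEADER_LENGTH);
            frames.add(new Frame(id, body));
            in.position(start + size);
        }
        return frames;
    }
}
```

A message with an ID the client does not know can be skipped simply by ignoring its frame, since the next message always starts at the position given by the size.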

Attribute data types

| Type | Size | Default | Format (default) |
| --- | --- | --- | --- |
| integer | 4 bytes | 0 | Little-endian |
| long | 8 bytes | 0 | Little-endian |
| double | 8 bytes | 0.0 | ISO floating point format |
| string | variable-length | \0 | UTF8 charset, terminated by \0 |
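Reading these four types on the Java side maps closely onto `java.nio.ByteBuffer`. The sketch below uses the default formats from the table (little-endian, UTF-8, \0-terminated strings). The class name is hypothetical, and treating the double as an IEEE 754 value in the same byte order as the integers is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for attribute values in their default formats. */
final class AttributeReader {
    private final ByteBuffer in;

    AttributeReader(byte[] attributes) {
        this.in = ByteBuffer.wrap(attributes).order(ByteOrder.LITTLE_ENDIAN);
    }

    int readInteger() {
        return in.getInt();
    }

    long readLong() {
        return in.getLong();
    }

    double readDouble() {
        return in.getDouble();
    }

    /** Reads bytes up to the terminating \0; an unused string is just the terminator. */
    String readString() {
        int start = in.position();
        while (in.get() != 0) {
            // keep scanning; get() throws BufferUnderflowException if no terminator exists
        }
        int length = in.position() - 1 - start;
        return new String(in.array(), in.arrayOffset() + start, length, StandardCharsets.UTF_8);
    }
}
```

Attributes must be read in exactly the predefined order of the message, because the stream carries no field names or type tags.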

Example

To show the benefit, consider a widely used tag: <threadStart threadId="1" time="1185890426.304424453" objIdRef="1" threadName="Reference Handler" groupName="system" parentName=""/>, and assume this tag has the unique ID 32.

| XML tag, attribute | XML size | Replacement | Data type | Binary size | Compression ratio |
| --- | --- | --- | --- | --- | --- |
| <threadStart | 12 bytes | 32 | ID (short) | 2 bytes | 83% |
| n/a | n/a | Message size in bytes | integer | 4 bytes | n/a |
| transientThreadId | 0 bytes | 0 | integer | 4 bytes | n/a |
| threadId="1" | 12 bytes | 1 | integer | 4 bytes | 66% |
| time="1185890426.304424453" | 28 bytes | 1185890426.304424453 | double | 8 bytes | 71% |
| transientObjIdRef | 0 bytes | 0 | integer | 4 bytes | n/a |
| objIdRef="1" | 13 bytes | 1 | integer | 4 bytes | 69% |
| threadName="Reference Handler" | 31 bytes | Reference Handler\0 | string | 18 bytes | 42% |
| groupName="system" | 19 bytes | system\0 | string | 7 bytes | 63% |
| parentName="" | 14 bytes | \0 | string | 1 byte | 93% |
| collationValue | 0 bytes | 0 | integer | 4 bytes | n/a |
| /> | 2 bytes | n/a | n/a | 0 bytes | 100% |
| Total | 131 bytes | | | 60 bytes | 54% |

Note that, by design, a binary message must have a constant number and order of attributes. Attributes that are not included in the XML message are the ones listed with an XML size of 0 bytes.

The resulting binary message takes 46% of the size of the original XML message.
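The 60-byte total can be checked by encoding the example message. The sketch below follows the attribute order and types of the example table, with the example ID 32. The class name is hypothetical, and writing the total length (header included) into the size field is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical encoder for the threadStart example above. */
final class ThreadStartExample {
    static byte[] encode() {
        ByteBuffer attrs = ByteBuffer.allocate(256).order(ByteOrder.LITTLE_ENDIAN);
        attrs.putInt(0);                          // transientThreadId (unused, default)
        attrs.putInt(1);                          // threadId
        attrs.putDouble(1185890426.304424453);    // time
        attrs.putInt(0);                          // transientObjIdRef (unused, default)
        attrs.putInt(1);                          // objIdRef
        putString(attrs, "Reference Handler");    // threadName
        putString(attrs, "system");               // groupName
        putString(attrs, "");                     // parentName
        attrs.putInt(0);                          // collationValue (unused, default)

        int total = 2 + 4 + attrs.position();     // ID + size field + attributes
        ByteBuffer message = ByteBuffer.allocate(total).order(ByteOrder.LITTLE_ENDIAN);
        message.putShort((short) 32);             // example message ID
        message.putInt(total);                    // assumption: size includes the header
        attrs.flip();
        message.put(attrs);
        return message.array();
    }

    private static void putString(ByteBuffer out, String value) {
        out.put(value.getBytes(StandardCharsets.UTF_8)).put((byte) 0);
    }
}
```

Calling `ThreadStartExample.encode().length` gives 60, which matches the total in the table: 6 header bytes plus 54 attribute bytes.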

Native format loader

Introducing a custom binary format also requires a custom Java loader on the client side. The loader should be aware of all custom event types.
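One possible shape for such a loader is a registry that maps message IDs to handlers. This is a sketch only: the class, interface, and method names are hypothetical and do not describe the actual loader design.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical loader skeleton that routes messages to per-type handlers. */
final class BinaryLoader {
    /** Handles the attribute bytes of one message type. */
    interface MessageHandler {
        void handle(ByteBuffer attributes);
    }

    private final Map<Integer, MessageHandler> handlers = new HashMap<>();

    void register(int messageId, MessageHandler handler) {
        handlers.put(messageId, handler);
    }

    /** Returns false when no handler knows the ID, so the caller can skip the message. */
    boolean dispatch(int messageId, ByteBuffer attributes) {
        MessageHandler handler = handlers.get(messageId);
        if (handler == null) {
            return false;
        }
        handler.handle(attributes);
        return true;
    }
}
```

Registering one handler per message ID keeps the knowledge of each event's attribute order in a single place.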

The binary format parser works as follows:

Specific message handler works as follows:

Format messages

The current implementation of the TI profiler does not use all events listed in the public format description. Below is the list of events and attributes that the TI profiler emits. The Used column marks the attributes that are actually used.

Node (<node>)

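| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1001 | Yes |
| Length | Integer | n/a | Yes |
| Node ID | String | NULL | Yes |
| Hostname | String | NULL | Yes |
| IP address | String | NULL | Yes |
| Time zone | Integer | 0 | Yes |
| Timestamp | Long | 0 | Yes |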

Process Create (<processCreate>)

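| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1002 | Yes |
| Length | Integer | n/a | Yes |
| Process ID | String | NULL | Yes |
| Name | String | NULL | No |
| PID | Integer | 0 | Yes |
| Node ID Ref | String | NULL | Yes |
| Timestamp | Long | 0 | Yes |
| Executable | String | NULL | No |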

Agent Create (<agentCreate>)

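| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1003 | Yes |
| Length | Integer | n/a | Yes |
| Agent ID | String | NULL | Yes |
| Process ID | String | NULL | Yes |
| Name | String | NULL | Yes |
| Type | String | NULL | Yes |
| Timestamp | Long | 0 | Yes |
| Parameters | String | NULL | Yes |
| Version | String | NULL | Yes |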

Agent Destroy (<agentDestroy>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1004 | Yes |
| Length | Integer | n/a | Yes |
| Agent ID | String | NULL | Yes |
| Timestamp | Long | 0 | Yes |

Trace start (<traceStart>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1005 | Yes |
| Length | Integer | n/a | Yes |
| Trace ID | String | NULL | Yes |
| Agent ID | String | NULL | Yes |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |

Trace end (<traceEnd>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1006 | Yes |
| Length | Integer | n/a | Yes |
| Trace ID | String | NULL | No |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |

Filter (<filter>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1007 | Yes |
| Length | Integer | n/a | Yes |
| Trace ID | String | NULL | Yes |
| Pattern | String | NULL | Yes |
| Generic pattern | String | NULL | Yes |
| Mode | String | NULL | Yes |
| Method pattern | String | NULL | Yes |
| Method generic pattern | String | NULL | Yes |
| Method mode | String | NULL | Yes |

Option (<option>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1008 | Yes |
| Length | Integer | n/a | Yes |
| Trace ID | String | NULL | No |
| Key | String | NULL | Yes |
| Value | String | NULL | Yes |

Thread start (<threadStart>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1009 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Group name | String | NULL | Yes |
| Parent group name | String | NULL | Yes |
| Transient object ID | Long | 0 | No |
| Object ID | Long | 0 | Yes |
| Thread name | String | NULL | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Thread end (<threadEnd>)

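| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1010 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |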

Class definition (<classDef>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1011 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | No |
| Timestamp | Long | 0 | Yes |
| Number of interfaces | Integer | 0 | No |
| Interface names | String | NULL | No |
| Transient class ID | Long | 0 | No |
| Class ID | Long | 0 | Yes |
| Source name | String | NULL | Yes |
| Class loader | String | NULL | No |
| Super class | String | NULL | No |
| Transient object ID | Long | 0 | No |
| Object ID | Long | 0 | No |
| Name | String | NULL | Yes |
| Access | String | NULL | No |
| Number of static fields | Integer | 0 | No |
| Number of methods | Integer | 0 | No |
| Number of instance fields | Integer | 0 | No |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Method definition (<methodDef>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1012 | Yes |
| Length | Integer | n/a | Yes |
| Name | String | NULL | Yes |
| Signature | String | NULL | No |
| Is native | Byte | 0 | No |
| Is abstract | Byte | 0 | No |
| Is static | Byte | 0 | No |
| Is synchronized | Byte | 0 | No |
| Exceptions | String | NULL | No |
| Start line | Long | 0 | Yes |
| End line | Long | 0 | Yes |
| Signature notation | String | NULL | No |
| Transient class ID | Long | 0 | No |
| Class ID | Long | 0 | Yes |
| Method ID | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Object allocation (<objAlloc>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1014 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Is array | Byte | 0 | Yes |
| Transient object ID | Long | 0 | No |
| Object ID | Long | 0 | Yes |
| Size | Long | 0 | Yes |
| Line | Long | 0 | Yes |
| Method ID | Long | 0 | Yes |
| Transient class ID | Long | 0 | No |
| Class ID | Long | 0 | Yes |
| Context data | String | NULL | No |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Method entry (<methodEntry>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1015 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Method ID | Long | 0 | Yes |
| Ticket | Integer | 0 | Yes |
| Transient object ID | Long | 0 | No |
| Class ID | Long | 0 | Yes |
| Thread CPU time | Long | 0 | No |
| Sequence counter | Long | 0 | No |
| Stack depth | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Method exit (<methodExit>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1016 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Ticket | Integer | 0 | Yes |
| Thread CPU time | Long | 0 | Yes |
| Method ID | Long | 0 | No |
| Transient object ID | Long | 0 | No |
| Object ID | Long | 0 | No |
| Transient class ID | Long | 0 | Yes |
| Class ID | Long | 0 | No |
| Sequence counter | String | NULL | No |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

GC start (<gcStart>)

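| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1023 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | No |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |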

Object free (<objFree>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1024 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | No |
| Timestamp | Long | 0 | Yes |
| Transient object ID | Long | 0 | No |
| Object ID | Long | 0 | Yes |
| Object age | Long | 0 | Yes |
| Sequence counter | Long | 0 | No |
| Stack depth | Long | 0 | No |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

GC finish (<gcFinish>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1027 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | No |
| Timestamp | Long | 0 | Yes |
| Total object space | Long | 0 | No |
| Used object space | Long | 0 | No |
| Used object | Long | 0 | No |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |

Runtime init done (<runtimeInitDone>)

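| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1030 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |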

Runtime shutdown (<runtimeShutdown>)

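| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1031 | Yes |
| Length | Integer | n/a | Yes |
| Transient thread ID | Long | 0 | No |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Collation value | String | NULL | No |
| Trace ID | String | NULL | No |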

Monitor wait (<monWait>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1032 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Object ID | Long | 0 | Yes |
| Timeout | Long | 0 | Yes |
| Stack depth * | Integer | 0 | Yes |
| Stack methods * | String[] | 0 | Yes |
| Stack lines * | Long[] | 0 | Yes |

* Fields specific to the binary format, which replace annotations.
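The layout of the String[] and Long[] fields is not spelled out here. One plausible reading, shown purely as an assumption, is that the stack depth gives the element count of both arrays, followed by that many \0-terminated method strings and then that many 8-byte line values. All names in this sketch are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for the three starred stack fields of the monitor messages. */
final class StackTraceFields {
    final String[] methods;
    final long[] lines;

    private StackTraceFields(String[] methods, long[] lines) {
        this.methods = methods;
        this.lines = lines;
    }

    /** Reads from the buffer's current position, using the buffer's byte order. */
    static StackTraceFields read(ByteBuffer in) {
        int depth = in.getInt(); // assumption: element count of both arrays
        // Each method string occupies at least its terminator byte.
        if (depth < 0 || depth > in.remaining()) {
            throw new IllegalArgumentException("implausible stack depth: " + depth);
        }
        String[] methods = new String[depth];
        for (int i = 0; i < depth; i++) {
            methods[i] = readString(in);
        }
        long[] lines = new long[depth];
        for (int i = 0; i < depth; i++) {
            lines[i] = in.getLong();
        }
        return new StackTraceFields(methods, lines);
    }

    private static String readString(ByteBuffer in) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (byte b = in.get(); b != 0; b = in.get()) {
            bytes.write(b);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```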

Monitor waited (<monWaited>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1033 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Object ID | Long | 0 | Yes |
| Timeout | Long | 0 | Yes |
| Stack depth * | Integer | 0 | Yes |
| Stack methods * | String[] | 0 | Yes |
| Stack lines * | Long[] | 0 | Yes |

* Fields specific to the binary format, which replace annotations.

Monitor contended enter (<monContendedEnter>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1034 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Object ID | Long | 0 | Yes |
| Thread owner ID | Long | 0 | Yes |
| Stack depth * | Integer | 0 | Yes |
| Stack methods * | String[] | 0 | Yes |
| Stack lines * | Long[] | 0 | Yes |

* Fields specific to the binary format, which replace annotations.

Monitor contended entered (<monContendedEntered>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1035 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Object ID | Long | 0 | Yes |
| Stack depth * | Integer | 0 | Yes |
| Stack methods * | String[] | 0 | Yes |
| Stack lines * | Long[] | 0 | Yes |

* Fields specific to the binary format, which replace annotations.

Aggregate method entry (<agMethodEntry>)

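| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1036 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Method ID | Long | 0 | Yes |
| Base time | Long | 0 | Yes |
| Minimum time | Long | 0 | Yes |
| Maximum time | Long | 0 | Yes |
| Base CPU time | Long | 0 | Yes |
| Number of calls | Long | 0 | Yes |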

Aggregate method exit (<agMethodExit>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1037 | Yes |
| Length | Integer | n/a | Yes |
| Thread ID | Long | 0 | Yes |
| Method ID | Long | 0 | Yes |

Heap dump start (<hdStart>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1038 | Yes |
| Length | Integer | n/a | Yes |
| Heap dump ID | Long | 0 | Yes |
| Timestamp | Long | 0 | Yes |
| Name | String | NULL | Yes |
| Base time | Long | 0 | Yes |

GC root (<gcRoot>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1039 | Yes |
| Length | Integer | n/a | Yes |
| Class ID | Long | 0 | Yes |
| Object ID | Long | 0 | Yes |
| Type | String | NULL | Yes |

Object reference (<objRef>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1040 | Yes |
| Length | Integer | n/a | Yes |
| Source ID | Long | 0 | Yes |
| Target ID | Long | 0 | Yes |
| Heap dump ID | Long | 0 | Yes |

Custom (<custom>)

| Attribute | Type | Default value | Used |
| --- | --- | --- | --- |
| ID | Short | 1041 | Yes |
| Length | Integer | n/a | Yes |
| Custom body | String | NULL | Yes |