{"id":1623,"date":"2016-03-12T20:48:53","date_gmt":"2016-03-12T18:48:53","guid":{"rendered":"https:\/\/alax.info\/blog\/?p=1623"},"modified":"2016-03-12T20:48:53","modified_gmt":"2016-03-12T18:48:53","slug":"calling-convention-violator-broke-streaming-loop-pretty-far-away","status":"publish","type":"post","link":"https:\/\/alax.info\/blog\/1623","title":{"rendered":"Calling convention violator broke streaming loop pretty far away"},"content":{"rendered":"<p>A really nasty problem coming from the MainConcept AVC\/H.264 SDK Encoder was destroying a media streaming pipeline. The SDK is somewhat old (9.7.9.5738) and the problem might already be fixed, or might not. The problem is a good example of how a small bug can become a big pain.<\/p>\n<p>The problem was coming up in 64-bit Release builds only. Win32 build? OK. Debug build where you can step things through? No problem.<\/p>\n<p>The bug materialized in the <a href=\"https:\/\/github.com\/roman380\/gdcl.co.uk-mpeg4\">GDCL MP4 Demultiplexer<\/a> filter streaming (the Demultiplexer filter in the pipeline below), which was generating media samples with incorrect time stamps.<\/p>\n<p><a href=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd2.png\" rel=\"attachment wp-att-1624\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-1624\" src=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd2.png\" alt=\"Pipeline\" width=\"713\" height=\"410\" srcset=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd2.png 713w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd2-320x184.png 320w\" sizes=\"auto, (max-width: 713px) 100vw, 713px\" \/><\/a><\/p>\n<p>The initial start and stop times are okay, but the following ones come out as <code>_I64_MIN<\/code> (incorrect).<\/p>\n<p><a href=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3.png\" rel=\"attachment wp-att-1625\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-1625\" 
src=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3-800x394.png\" alt=\"Clipbrd3\" width=\"648\" height=\"319\" srcset=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3-800x394.png 800w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3-320x158.png 320w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3-768x378.png 768w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2016\/03\/Clipbrd3.png 1186w\" sizes=\"auto, (max-width: 648px) 100vw, 648px\" \/><\/a><\/p>\n<p>The problem appears to be related to SSE optimization and the x64 calling convention. This explains why only the 64-bit Release build suffers from the issue. The MS compiler decided to use the <code>XMM7<\/code> register for the <code>dRate<\/code> variable in <a href=\"https:\/\/github.com\/roman380\/gdcl.co.uk-mpeg4\/blob\/master\/mp4demux\/DemuxFilter.cpp#L812\">this code fragment<\/a>:<\/p>\n<pre><code>REFERENCE_TIME tStart, tStop;\r\ndouble dRate;\r\nm_pParser-&gt;GetSeekingParams(&amp;tStart, &amp;tStop, &amp;dRate);\r\n\r\n[...]\r\n\r\nfor(; ; )\r\n{\r\n    [...]\r\n\r\n    tSampleStart = REFERENCE_TIME(tSampleStart \/ dRate);\r\n    tSampleEnd = REFERENCE_TIME(tSampleEnd \/ dRate);\r\n<\/code><\/pre>\n<p><code>dRate<\/code> is the only floating point value here, and it&#8217;s clear why the compiler kept the variable in a register: there is no other floating point activity around.<\/p>\n<p>However, sample delivery goes pretty deep into other functions and modules, eventually reaching the MainConcept H.264 encoder. One of its functions violates the <a href=\"http:\/\/stackoverflow.com\/a\/262328\/868014\">x64 calling convention<\/a> and does not preserve XMM6+ register values, which a callee is required to save and restore. OOPS! 
Everything appears to work, but after media sample delivery the <code>dRate<\/code> value is destroyed and subsequent media samples receive incorrect time stamps.<\/p>\n<p>It is not really a problem of the MP4 demultiplexer, of course; however, media sample delivery might involve a long delivery chain where any violator would break the streaming loop. At the same time, it is not really a big expense to de-optimize the floating point math in the demultiplexer for those few time stamp adjustment operations. A <code>volatile<\/code> specifier prevents the compiler from keeping the value in a register and makes the loop resistant to SSE2 register violators:<\/p>\n<pre><code>\/\/ HOTFIX: Volatile specifier is not really necessary here but it fixes a nasty problem with MainConcept AVC SDK violating x64 calling convention;\r\n\/\/         MS compiler might choose to keep dRate in XMM6 register and the value would be destroyed by the violating call leading to incorrect \r\n\/\/         further streaming (wrong time stamps)\r\nvolatile DOUBLE dRate;\r\nm_pParser-&gt;GetSeekingParams(&amp;tStart, &amp;tStop, (DOUBLE*) &amp;dRate);<\/code><\/pre>\n<p>This makes this build of the H.264 encoding SDK unstable, and the problem is hopefully already fixed. The SDK indeed gave other troubles on specific architectures, leading to undefined behavior.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A really nasty problem coming from the MainConcept AVC\/H.264 SDK Encoder was destroying a media streaming pipeline. The SDK is somewhat old (9.7.9.5738) and the problem might already be fixed, or might not. The problem is a good example of how a small bug can become a big pain. 
The problem was coming up in 64-bit Release&hellip; <\/p>\n<p><a class=\"moretag\" href=\"https:\/\/alax.info\/blog\/1623\">Read the full article<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[63,78,269,379,503,486],"class_list":["post-1623","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-bug","tag-directshow","tag-fix","tag-h-264","tag-mainconcept","tag-video"],"_links":{"self":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts\/1623","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/comments?post=1623"}],"version-history":[{"count":0,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts\/1623\/revisions"}],"wp:attachment":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/media?parent=1623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/categories?post=1623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/tags?post=1623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}