{
  "epoch": [
    1,
    2,
    3,
    4,
    5,
    6,
    7,
    8,
    9,
    10,
    11,
    12,
    13,
    14,
    15,
    16,
    17,
    18,
    19,
    20,
    21,
    22,
    23,
    24,
    25,
    26,
    27,
    28,
    29,
    30,
    31,
    32,
    33,
    34,
    35,
    36,
    37,
    38,
    39,
    40,
    41,
    42,
    43,
    44,
    45,
    46,
    47,
    48,
    49,
    50,
    51,
    52,
    53,
    54,
    55,
    56,
    57,
    58,
    59,
    60,
    61,
    62,
    63,
    64,
    65,
    66,
    67,
    68,
    69,
    70,
    71,
    72,
    73,
    74,
    75,
    76,
    77,
    78,
    79,
    80,
    81,
    82,
    83,
    84,
    85,
    86,
    87,
    88,
    89,
    90,
    91,
    92,
    93,
    94,
    95,
    96,
    97,
    98,
    99,
    100,
    101,
    102,
    103,
    104,
    105,
    106,
    107,
    108,
    109,
    110,
    111,
    112,
    113,
    114,
    115,
    116,
    117,
    118,
    119,
    120,
    121,
    122,
    123,
    124,
    125,
    126,
    127,
    128,
    129,
    130,
    131,
    132,
    133,
    134,
    135,
    136,
    137,
    138,
    139,
    140,
    141,
    142,
    143,
    144,
    145,
    146,
    147,
    148,
    149,
    150,
    151,
    152,
    153,
    154,
    155,
    156,
    157,
    158,
    159,
    160,
    161,
    162,
    163,
    164,
    165,
    166,
    167,
    168,
    169,
    170,
    171,
    172,
    173,
    174,
    175,
    176,
    177,
    178,
    179,
    180,
    181,
    182,
    183,
    184,
    185,
    186,
    187,
    188,
    189,
    190,
    191,
    192,
    193,
    194,
    195,
    196,
    197,
    198,
    199,
    200,
    201,
    202,
    203,
    204,
    205,
    206,
    207,
    208,
    209,
    210,
    211,
    212,
    213,
    214,
    215,
    216,
    217,
    218,
    219,
    220,
    221,
    222,
    223,
    224,
    225,
    226,
    227,
    228,
    229,
    230,
    231,
    232,
    233,
    234,
    235,
    236,
    237,
    238,
    239,
    240,
    241,
    242,
    243,
    244,
    245,
    246,
    247,
    248,
    249,
    250,
    251,
    252,
    253,
    254,
    255,
    256,
    257,
    258,
    259,
    260
  ],
  "train_loss": [
    2.1653225806451615,
    2.0854334677419355,
    2.0246975806451615,
    1.951108870967742,
    1.8608870967741935,
    1.7550403225806452,
    1.6461693548387097,
    1.5372983870967742,
    1.4372479838709677,
    1.3402217741935485,
    1.2484879032258065,
    1.1568800403225807,
    1.0779989919354838,
    0.9987399193548387,
    0.924773185483871,
    0.858366935483871,
    0.7876764112903226,
    0.7259324596774194,
    0.6619203629032258,
    0.6018145161290323,
    0.5466229838709677,
    0.4959047379032258,
    0.4500378024193548,
    0.40139868951612906,
    0.3580519153225806,
    0.31930443548387094,
    0.28537676411290325,
    0.25708795362903225,
    0.22826360887096775,
    0.20567666330645162,
    0.18623991935483872,
    0.16617313508064516,
    0.15004410282258066,
    0.1359154485887097,
    0.12268460181451613,
    0.1139742943548387,
    0.1065398185483871,
    0.10102696572580645,
    0.09627016129032258,
    0.09108807963709678,
    0.0859375,
    0.08147208921370967,
    0.07776272681451613,
    0.07481728830645161,
    0.07234438004032258,
    0.06984784526209678,
    0.06800497731854839,
    0.06667401713709678,
    0.06599672379032258,
    0.06434286794354839,
    0.06181483114919355,
    0.060286983366935484,
    0.05977507560483871,
    0.05903477822580645,
    0.05864100302419355,
    0.05852287046370968,
    0.05717615927419355,
    0.05699502268145161,
    0.05702652469758065,
    0.05852287046370968,
    0.06005071824596774,
    0.057837701612903226,
    0.05583732358870968,
    0.053616431451612906,
    0.05241935483870968,
    0.05113564768145161,
    0.05025359122983871,
    0.04932428175403226,
    0.047835811491935484,
    0.04726089969758065,
    0.04609532510080645,
    0.04658360635080645,
    0.04675686743951613,
    0.04780430947580645,
    0.048394972278225805,
    0.050064579133064516,
    0.05089150705645161,
    0.05167905745967742,
    0.05256898941532258,
    0.055648311491935484,
    0.07014711441532258,
    0.10475995463709678,
    0.0823147681451613,
    0.06492565524193548,
    0.052718623991935484,
    0.04639459425403226,
    0.042740360383064516,
    0.040015435987903226,
    0.03851121471774194,
    0.037621282762096774,
    0.03703849546370968,
    0.03668409778225806,
    0.03626669606854839,
    0.03593592489919355,
    0.03561302923387097,
    0.035455519153225805,
    0.03516412550403226,
    0.034998739919354836,
    0.03478610131048387,
    0.03466796875,
    0.034518334173387094,
    0.034518334173387094,
    0.034408077116935484,
    0.03432144657258065,
    0.03410093245967742,
    0.03410880796370968,
    0.03424269153225806,
    0.03503024193548387,
    0.03620369203629032,
    0.037511025705645164,
    0.03624306955645161,
    0.036392704133064516,
    0.03788117439516129,
    0.04195280997983871,
    0.05383694556451613,
    0.08255103326612903,
    0.1022555443548387,
    0.08774886592741936,
    0.06718592489919355,
    0.05193894909274194,
    0.04377205141129032,
    0.03965316280241935,
    0.037684286794354836,
    0.03625882056451613,
    0.03536101310483871,
    0.03495148689516129,
    0.03471522177419355,
    0.03450258316532258,
    0.03422694052419355,
    0.03421118951612903,
    0.03406943044354839,
    0.033841040826612906,
    0.03365202872983871,
    0.033636277721774195,
    0.03358902469758065,
    0.03354964717741935,
    0.03357327368951613,
    0.03358902469758065,
    0.033604775705645164,
    0.033778036794354836,
    0.03389616935483871,
    0.03387254284274194,
    0.03369928175403226,
    0.034581338205645164,
    0.033990675403225805,
    0.03397492439516129,
    0.03428994455645161,
    0.03507749495967742,
    0.036865234375,
    0.04227570564516129,
    0.05994046118951613,
    0.08571698588709678,
    0.08735509072580645,
    0.0731476814516129,
    0.05660912298387097,
    0.045811806955645164,
    0.04033045614919355,
    0.03760553175403226,
    0.036077683971774195,
    0.03525863155241935,
    0.03484910534274194,
    0.03421906502016129,
    0.03388829385080645,
    0.033667779737903226,
    0.033447265625,
    0.03317162298387097,
    0.03306924143145161,
    0.03299836189516129,
    0.03285660282258065,
    0.032801474294354836,
    0.032730594758064516,
    0.03267546622983871,
    0.032608524445564516,
    0.03258883568548387,
    0.032569146925403226,
    0.03248251638104839,
    0.03246282762096774,
    0.03243920110887097,
    0.03238801033266129,
    0.032325006300403226,
    0.032242313508064516,
    0.032242313508064516,
    0.032202935987903226,
    0.032159620715725805,
    0.03216355846774194,
    0.03215174521169355,
    0.03205723916330645,
    0.03215174521169355,
    0.03266759072580645,
    0.034447454637096774,
    0.04132276965725806,
    0.06315366683467742,
    0.09197013608870967,
    0.08710307459677419,
    0.06614635836693548,
    0.05098601310483871,
    0.04316563760080645,
    0.038385206653225805,
    0.03581779233870968,
    0.03473097278225806,
    0.03413243447580645,
    0.03369928175403226,
    0.03326612903225806,
    0.03302986391129032,
    0.03278572328629032,
    0.03272271925403226,
    0.03257308467741935,
    0.032486454133064516,
    0.032376197076612906,
    0.03236044606854839,
    0.032254126764112906,
    0.03217537172379032,
    0.03215174521169355,
    0.032088741179435484,
    0.0321044921875,
    0.03210055443548387,
    0.03195091985887097,
    0.031966670866935484,
    0.031883978074596774,
    0.0318603515625,
    0.03190366683467742,
    0.031836725050403226,
    0.03180916078629032,
    0.03186822706653226,
    0.03178553427419355,
    0.03174615675403226,
    0.03170677923387097,
    0.031683152721774195,
    0.03166346396169355,
    0.03173040574596774,
    0.031643775201612906,
    0.03162408644153226,
    0.03163589969758065,
    0.031592584425403226,
    0.03159652217741935,
    0.03156895791330645,
    0.03156895791330645,
    0.031588646673387094,
    0.031592584425403226,
    0.03157289566532258,
    0.031561082409274195,
    0.031529580393145164,
    0.03148626512096774,
    0.03274634576612903,
    0.03498298891129032,
    0.03531769783266129,
    0.04793031754032258,
    0.10855594758064516,
    0.12054246471774194,
    0.0784085181451613,
    0.05210433467741935,
    0.04178742439516129,
    0.03704637096774194,
    0.03498298891129032,
    0.03392767137096774,
    0.03313224546370968,
    0.03284085181451613,
    0.03253370715725806,
    0.03232894405241935,
    0.032293504284274195
  ],
  "learning_rate": [
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05,
    8e-05
  ],
  "grad_norm": [
    0.12508780217388385,
    0.10589783757127305,
    0.15174462650490367,
    0.23683663907213262,
    0.35566839408929857,
    0.500785469173796,
    0.676177252371645,
    0.820281552223095,
    0.9877766315923661,
    1.1099611766029238,
    1.254318802278963,
    1.364250113698242,
    1.4436021767148006,
    1.566250730888701,
    1.5917735750262694,
    1.8365009077222336,
    1.8082747480241472,
    1.960557358156948,
    2.011126821588749,
    1.9903028629712782,
    2.0304202495241435,
    2.0948016419807995,
    2.0415717191822935,
    2.0306125749849815,
    2.0013712901288607,
    1.9747866425509277,
    1.9770930304663026,
    2.018837844342008,
    1.858041285597586,
    1.8514059382285057,
    1.7670133956584753,
    1.7204667481843314,
    1.6285822748157524,
    1.4840076232497716,
    1.4136498202792631,
    1.394516990314487,
    1.3890230479439165,
    1.3666539629845869,
    1.3582045012042616,
    1.302809449404967,
    1.2470303656673147,
    1.206048270409138,
    1.169783548432182,
    1.1699825935849726,
    1.1393643410666272,
    1.1085912307691939,
    1.106248330327584,
    1.0994294745188635,
    1.1365313849018153,
    1.0987073075800775,
    1.0232115868853937,
    1.0189862421905729,
    1.0417421063869357,
    1.0502795534767813,
    1.0343517222340812,
    1.0169984250393713,
    1.007694715953276,
    1.0466050442515669,
    1.07883493147851,
    1.127494352491545,
    1.204007737943031,
    1.1195053611083172,
    1.0401040763601026,
    0.9472522592613358,
    0.9784527016528088,
    0.9103317092510647,
    0.8773078204859928,
    0.8444041604645812,
    0.8011969905457145,
    0.768146890774296,
    0.7376196461698858,
    0.7925945805487759,
    0.7865666324098216,
    0.8789447678838743,
    0.8988735265321858,
    0.9815188753450448,
    1.027773544417018,
    1.0560326379622964,
    1.1260931924880098,
    1.2258310677602513,
    1.9957681507403364,
    2.240855753649523,
    1.680556628029025,
    1.2918283645345892,
    0.8967097419303758,
    0.6508233581435853,
    0.4776292534977653,
    0.3208847211858974,
    0.27451306876598575,
    0.23085643631093158,
    0.21273489606900603,
    0.2024740380438469,
    0.19664074613963597,
    0.18924849685307485,
    0.1778400786254458,
    0.17993360182018506,
    0.17357794138986954,
    0.17071165260749707,
    0.168177485254929,
    0.16568644912089917,
    0.16617763024876503,
    0.1629179752217632,
    0.1596342881603384,
    0.1614259722984962,
    0.15579894502611003,
    0.16386143126420788,
    0.18860747292442037,
    0.2760594883259369,
    0.2988970563883983,
    0.3067668169537917,
    0.2951965424793687,
    0.35979850313613204,
    0.5550166776775091,
    0.8452513492357019,
    1.332258910472989,
    1.9900154188467931,
    2.187639917185989,
    1.740716116319444,
    1.2776139713040147,
    0.8935901427061529,
    0.5997878358938543,
    0.4060067910189308,
    0.2931059668498743,
    0.21428592636063803,
    0.17150477712290296,
    0.1626039398219169,
    0.16109386149943067,
    0.15879486422862293,
    0.1618787906960178,
    0.1686287260533333,
    0.15593488775644235,
    0.14988426061686613,
    0.15114853711175655,
    0.14903979781339813,
    0.14733812989942743,
    0.15815783134910208,
    0.16181177024811266,
    0.1576237876823061,
    0.1765343305217553,
    0.1904642882724223,
    0.19342688111695228,
    0.1952366246552799,
    0.217648773522509,
    0.25463016022812474,
    0.2090953898310137,
    0.2221349114452695,
    0.26768796923675103,
    0.3799666977022162,
    0.5152483704889042,
    0.8658725256642057,
    1.4202908750582315,
    1.8027172324696834,
    1.7026456226210214,
    1.46212622999752,
    1.0191369128148673,
    0.7010315859193706,
    0.4970560749108091,
    0.34700873516518127,
    0.2906881573177495,
    0.22215259384081867,
    0.20439044549822682,
    0.17593801477643584,
    0.16897083035591,
    0.14682422726675604,
    0.1387855521125188,
    0.12815660223112435,
    0.12758539543466113,
    0.12620912018988245,
    0.12406743585428506,
    0.12385791481349159,
    0.12183548296813076,
    0.12135149016798379,
    0.12018968126874775,
    0.11857864572861028,
    0.11928146912143647,
    0.11891698995748405,
    0.12194840349161211,
    0.11572103121805923,
    0.1147710585989077,
    0.11307539241134024,
    0.10960109229699286,
    0.11282158497675064,
    0.11486776767760827,
    0.10937919129379944,
    0.10790435739184402,
    0.10723043150820541,
    0.1066234342464061,
    0.14305521961296497,
    0.22007284564799118,
    0.42892609660474335,
    0.8136513939652764,
    1.383867791289469,
    1.7453053000777863,
    1.559557069112772,
    1.1709673727542507,
    0.8406092619266168,
    0.5956486901323502,
    0.4008590460131189,
    0.2827874397695421,
    0.21693312920377164,
    0.19988915878499777,
    0.1554682899665758,
    0.14147582036747405,
    0.12240852889806565,
    0.11459912573597573,
    0.1137380589675554,
    0.1123398077467492,
    0.11056993566172314,
    0.1090066169626789,
    0.10925283810498068,
    0.10739698991389746,
    0.10526350041095493,
    0.10988409793304758,
    0.10451016947480374,
    0.10696834446658963,
    0.10304999223103563,
    0.101509819375359,
    0.10244245581995005,
    0.10052658535851329,
    0.10249588671568283,
    0.09924379373224763,
    0.09969166502384841,
    0.09726461699603904,
    0.09813925620552244,
    0.09614115849584869,
    0.09675057994027957,
    0.09604288492704285,
    0.09428955585751228,
    0.09515621883595413,
    0.0937050829363453,
    0.09343825561009222,
    0.0934059522431494,
    0.09275278553777808,
    0.0905985105440785,
    0.09017987122593256,
    0.09113838312614468,
    0.08940271238377202,
    0.09081929359862159,
    0.09398552051507605,
    0.088616611149146,
    0.09122026489313828,
    0.08778220698082154,
    0.08760504888164802,
    0.19144664897748734,
    0.23562288000537085,
    0.33802956825106717,
    0.9571112051286975,
    1.8856450976317443,
    1.7691664618949858,
    1.1934521991894573,
    0.7724940310796032,
    0.5261167768819515,
    0.32464321221285536,
    0.242082761519927,
    0.17504462184544067,
    0.12730742033677722,
    0.11274418687517623,
    0.10213256466394367,
    0.10269633156968092,
    0.09767058335289237
  ],
  "validation": [
    {
      "train_loss": null,
      "update_counter": 0,
      "epoch": 0.0,
      "val_loss": 2.171875,
      "val_entropy": 2.12841796875,
      "val_ttr": 0.3108723958333333,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the results of the comparison of the whole timeline configuration.\n\n\\begin{figure}[H]\n\\centering\n\\includegraphics[width=0.8\\textwidth]{figures/comparison_whole_timeline_configuration_1.png}\n\\caption{Comparison of the whole timeline configuration}\n\\label{comparison_whole_timeline_configuration_1}\n\\end{figure}\n\n\\begin{figure}[H]\n\\centering\n\\includegraphics[width=0.8\\textwidth]{figures/comparison_whole_timeline_configuration_2.png}\n\\caption{Comparison of the whole timeline configuration}\n\\label{comparison_whole_timeline_configuration_2}\n\\end{figure}\n\n\\subsection{Comparison of the whole timeline configuration}\n\nIn the first figure we can see that the whole timeline configuration is the best configuration in terms of the number of nodes and the number of edges.\n\nIn the second figure we can see that the whole timeline configuration is the best configuration in terms of the number of nodes and the number of edges.\n"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " with equal time step size.\n\n\\begin{table}[h]\n\\centering\n\\begin{tabular}{c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ".\n\nThe first problem is a simple fluid-structure interaction problem. The fluid is a Newtonian fluid with a constant viscosity. The solid is a rigid body with a constant density. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the solid is incompressible. The fluid is incompressible and the"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10 m.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization error is the difference between the actual position of the robot and the position estimated by the localization algorithm.\n\nThe localization"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "xua} it was shown that the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a solution of the original CME equation because the CME is not a solution of the original CME equation. The reason is that the CME is not a"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2010,Kaltenbacher2011,Kaltenbacher2012,Kaltenbacher2013,Kaltenbacher2014,Kaltenbacher2015,Kaltenbacher2016,Kaltenbacher2017,Kaltenbacher2018,Kaltenbacher2019,Kaltenbacher2020,Kaltenbacher2021,Kaltenbacher2022,Kaltenbacher2023,Kaltenbacher2024,Kaltenbacher2025,Kaltenbacher2026,Kaltenbacher2027,Kaltenbacher2028,Kaltenbacher2029,Kaltenbacher2030,Kaltenbacher2031,Kaltenbacher2032,Kaltenbacher2033,Kaltenbacher2034,Kal"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " of the local minima of the objective function.\n    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates of the local minima of the objective function.\n    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates of the local minima of the objective function.\n    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates of the local minima of the objective function.\n    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates of the local minima of the objective function.\n    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates of the local minima of the objective function.\n    \\item \\emph{The spurious first"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the sequence of iterates to a local minimizer.\n    \\item \\emph{No convergence in the case of non-convexity.} The convergence theory in the convex case is not applicable to the non-convex case.\n    \\item \\emph{No convergence in the case of non-differentiability.} The convergence theory in the convex case is not applicable to the non-differentiable case.\n    \\item \\emph{No convergence in the case of non-smoothness.} The convergence theory in the convex case is not applicable to the non-smooth case.\n    \\item \\emph{No convergence in the case of non-Lipschitz continuity.} The convergence theory in the convex case is not applicable to the non-Lipschitz case.\n    \\item \\emph{No convergence in the case of non-smoothness.} The convergence theory in the convex case is not applicable to the non-smooth case.\n    \\item \\emph{No convergence in the case of non-differentiability.} The convergence theory in the convex case is not applicable to the non-differentiable case.\n   "
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the boundary of $\\Omega$ (e.g. $g_j = \\partial \\Omega$).\nI am trying to solve the following problem:\n$$\n\\begin{cases}\n\\nabla \\cdot (a(x)\\nabla u) = f(x) & \\text{in } \\Omega \\\\\nu = g_j & \\text{on } \\partial \\Omega\n\\end{cases}\n$$\nwhere $a(x)$ is a positive definite matrix.\nI am trying to solve this problem using the finite element method. I have a mesh $\\mathcal{T}_h$ and a set of basis functions $v_i$. I am trying to find a function $u_h$ such that\n$$\nu_h = \\sum_{i=1}^N \\alpha_i v_i\n$$\nwhere $\\alpha_i$ are the unknown coefficients.\nI am using the following formulation:\n$$\n\\begin{cases}\n\\int_{\\Omega} a(x) \\nabla u_h \\cdot \\nabla"
        }
      ]
    },
    {
      "train_loss": 1.8745,
      "update_counter": 250,
      "epoch": 8.03225806451613,
      "val_loss": 2.4013671875,
      "val_entropy": 1.78662109375,
      "val_ttr": 0.361328125,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the performance of the different configurations for the whole timeline. The results are averaged over 1000 runs. The first row shows the results for the configuration with the fixed window size, where the window size is set to $W=1000$ and $W=2000$. The second row shows the results for the configuration with the adaptive window size, where the window size is set to $W=1000$ and $W=2000$. The third row shows the results for the configuration with the fixed window size, where the window size is set to $W=1000$ and $W=2000$. The fourth row shows the results for the configuration with the adaptive window size, where the window size is set to $W=1000$ and $W=2000$. The last row shows the results for the configuration with the adaptive window size, where the window size is set to $W=1000$ and $W=2000$. In the first column, we"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for the two-dimensional problem with $A=\\Delta$ and $b=0$. The solution is the first eigenmode of the problem. We see that the a posteriori error estimator is not consistent in the sense of~\\eqref{eq:ap_estimator_consistency} in this case. The reason is that the uniform time meshes are not compatible with the uniform space meshes. The uniform space meshes are not compatible with the uniform time meshes in general. The a posteriori error estimator is not consistent in the sense of~\\eqref{eq:ap_estimator_consistency} in the case of the mixed finite element method for the Poisson problem with $A=\\Delta$ and $b=0$ with the space discretization of order $p$ and time discretization of order $q$ with $p<q$ in two dimensions. The reason is that the time discretization is not compatible with the space discretization in this case. The time discretization is not compatible with the space discretization in general. The a posteriori error estimator is not consistent in the sense of~\\eqref{eq:ap_estimator_consistency} in"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In this work, we use a two-step approach, where the solid is solved in a first step, followed by a second step for the fluid. The first step is a standard Newton-Krylov method for the solid, which is adapted to the problem at hand. The second step is a multirate time-stepping scheme for the fluid, which is adapted to the problem at hand. The overall scheme is shown in Figure \\ref{fig:adaptivity}. The first step is performed for the solid, where the Newton-Krylov method is adapted to the problem at hand. The Newton-Krylov method is a standard Newton-Krylov method, where the Newton iteration is performed using the Newton-Krylov method. The Newton-Krylov method is adapted to the problem at hand, where the adaptation is performed by using the residuals of the Newton-Krylov method. The residuals are computed using the residuals of the Newton-Krylov method. The residuals are computed using the residuals of the Newton-Krylov method. The residuals are computed"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% in $2.4$ GHz, they can be reduced to 1.5\\% in $7.2$ GHz. This is because the scanning frames are used to collect the channel information of the whole area, which is not available in non-scanning frames. The channel estimation error is caused by the fact that the channel estimation is performed by the receiver, which is located at the same position as the transmitter. Therefore, the channel estimation error is caused by the fact that the channel estimation is performed by the receiver, which is located at the same position as the transmitter. Therefore, the channel estimation error is caused by the fact that the channel estimation is performed by the receiver, which is located at the same position as the transmitter. Therefore, the channel estimation error is caused by the fact that the channel estimation is performed by the receiver, which is located at the same position as the transmitter. Therefore, the channel estimation error is caused by the fact that the channel estimation is performed by the receiver, which is located at the same position as the transmitter. Therefore, the channel estimation error is caused"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "isa} it is shown that the CME is not a stationary solution of the MHD equations, but rather a non-stationary solution of the MHD equations with a small perturbation. The non-stationary solution is a solution of the form $\\psi(t,r,\\theta,\\phi) = \\psi_0(r,\\theta,\\phi) + \\varphi(t,r,\\theta,\\phi)$, where $\\psi_0(r,\\theta,\\phi)$ is the stationary solution and $\\varphi(t,r,\\theta,\\phi)$ is a small perturbation. The perturbation $\\varphi(t,r,\\theta,\\phi)$ is a function of time, and therefore, the solution of the form $\\psi(t,r,\\theta,\\phi) = \\psi_0(r,\\theta,\\phi) + \\varphi(t,r,\\theta,\\phi)$ is not a stationary solution. The stationary solution is a solution of the form $\\psi(t,r,\\theta,\\phi) = \\psi_0(r,\\theta,\\phi) + \\varphi_0(r,\\theta,\\phi"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher:2010:EI:1853007,Kaltenbacher:2011:EI:1973001,Kaltenbacher:2012:EI:6301001,Kaltenbacher:2013:EI:6401001,Kaltenbacher:2014:EI:6501001,Kaltenbacher:2015:EI:6601001,Kaltenbacher:2016:EI:6701001,Kaltenbacher:2017:EI:6801001,Kaltenbacher:2018:EI:6901001,Kaltenbacher:2019:EI:7001001,Kaltenbacher:2020:EI:710"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " of the local minimum.  In the case of Landweber iteration, the local minimum is easily identified by the condition $f'(x)=0$.  However, in the presence of noise, the iteration $x_{n+1}=x_n-f(x_n)/f'(x_n)$ may converge to a spurious local minimum, which is not a local minimum of the function $f$.  This is the case for the function $f(x)=x^2-1$ and the initial point $x_0=1/2$.  The iteration $x_{n+1}=x_n-f(x_n)/f'(x_n)$ converges to $x_n=1/2$ and $f(x_n)=1/4$ for all $n$.  This iteration is not effective for the function $f(x)=x^2-1$ because it does not converge to a local minimum of $f$.  The iteration $x_{n+1}=x_n-f(x_n)/"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the sequence $\\left( \\mathbf{x}_{t}\\right) _{t=0}^T$ to a neighborhood of the set $\\mathcal{X}_{\\mathrm{s}}$. In practice, one can easily observe that the sequence $\\left( \\mathbf{x}_{t}\\right) _{t=0}^T$ converges to a point $\\mathbf{x}^*$ that is a local minimizer of $\\mathcal{J}\\left(\\mathbf{x},t\\right)$ for all $t\\in [0,T]$. In other words, $\\mathbf{x}^*$ is a local minimizer of $\\mathcal{J}\\left(\\cdot,T\\right)$ and $\\mathbf{x}^*$ is a local minimizer of $\\mathcal{J}\\left(\\cdot,t\\right)$ for all $t\\in [0,T]$. Therefore, we can say that $\\mathbf{x}^*$ is a local minimizer of $\\mathcal{J}\\left(\\cdot,t\\right)$ for all $t\\in [0,T]$. In other words, we can say that $\\mathbf{x}^*$ is a"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the boundary conditions on the boundary $\\partial \\Omega$, $j=1,2,3$. The boundary conditions are given by $g_1(x) = \\partial_n u(x)$ on $\\partial \\Omega$, $g_2(x) = \\partial_n u(x) + \\nabla u(x) \\cdot \\nu(x)$ on $\\partial \\Omega$, and $g_3(x) = 0$ on $\\partial \\Omega$, where $\\nu(x)$ is the unit outward normal vector on $\\partial \\Omega$. The boundary conditions are applied to the solution $u$ of the problem. The boundary conditions are applied to the solution $u$ of the problem. The Dirichlet problem is defined by the conditions $u = u_D$ on $\\partial \\Omega$, where $u_D$ is a given function. The Neumann problem is defined by the conditions $\\nabla u = \\nabla u_N$ on $\\partial \\Omega$, where $u_N$ is a given function. The Robin problem is defined by the"
        }
      ]
    },
    {
      "train_loss": 1.122171875,
      "update_counter": 500,
      "epoch": 16.096774193548388,
      "val_loss": 3.111328125,
      "val_entropy": 1.46044921875,
      "val_ttr": 0.5032552083333333,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the performance of the \\text{WholeTime} algorithm for different values of $N_w$ when using the \\text{FliF} configuration (configuration 1) and the \\text{LiiF} configuration (configuration 2). In both figures, the results for the \\text{LpF} configuration are shown in blue, the results for the \\text{LpF} with $\\varepsilon=1$ in red, and the results for the \\text{LpF} with $\\varepsilon=5$ in green. The results for the \\text{FliF} configuration are shown in orange when $\\varepsilon=1$, and in brown when $\\varepsilon=5$. The curves that are close to the top represent the results of the \\text{LpF} with $\\varepsilon=1$, which are not optimal. The results that are close to the bottom represent the results of the \\text{LpF} with $\\varepsilon=5$, which are optimal. The results that are close to the top in the figure~\\ref{comparison_whole_timeline"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for the case $L_i = 3$ and $L_i = 5$ in case $N_i = 10$ and $N_i = 20$. The residuals are averaged over all subproblems. The residuals are very smooth and oscillate around a constant value. This is due to the fact that the time meshes are uniform and the solution is not strongly dependent on the time step. Therefore, we use a second-order spline interpolation to fit a cubic function to the residuals and compute the standard error of the estimate. The standard error is very small and amounts to $SE = 0.11$ for $L_i = 3$ and $SE = 0.05$ for $L_i = 5$ for the case $N_i = 10$. For the case $N_i = 20$ the standard error amounts to $SE = 0.02$ for $L_i = 3$ and $SE = 0.02$ for $L_i = 5"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall solution of the coupled system is obtained by a final, time-synchronized solution of the coupled equations. In the context of the proposed algorithm, the overall solution is obtained by a single iteration of the above procedure. The overall solution can be further accelerated by exploiting the symmetries of the problems. For the fluid problem, the Stokes equations are invariant under rigid body rotations and the boundary conditions are invariant under translations. In the solid problem, the equations of motion are invariant under rigid body rotations and the boundary conditions are invariant under translations. In each of the multirate time-stepping schemes, the solution of the problem is accelerated by a fast Fourier transform (FFT) of the domain. In the solid problem, the solution is further accelerated by a fast Fourier transform of the boundary conditions. The FFT of the boundary conditions is updated at every time step, while the FFT of the domain is updated at every solid time step. The overall solution is obtained by a fast synchronization of the two FFTs. The synchronization is done by a fast Fourier transform of the residuals of the problems. The residuals are updated at"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% on the first floor, they can be reduced to 1\\% on the third floor. This is due to the fact that the user scans are recorded in the database and the mobile can verify the presence of the same RF signal in the \\ac{RSS} measurements on different floors. In the case of $3$ GHz, the highest reduction in the floor error is achieved on the first floor, where the scanning is effective in reducing the floor error from 18\\% to 1.5\\% (see Fig. \\ref{fig:13}). However, the floor error on the second floor is reduced from 10\\% to 1.5\\% (see Fig. \\ref{fig:14}). This is due to the fact that the \\ac{FCC} has reserved this band for WiFi-6, and the maximum channel width is 120 MHz. Therefore, the \\ac{RSS} values on different floors are similar, and the mobile can use the \\ac{AoD} information on different floors to identify the right user. In"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": ",Valgushev:2016}, it is shown that the inertia of the CME is predominantly provided by the solar wind. The acceleration of the solar wind is directed towards the ecliptic plane, and the mean field magnetic field points radially. Thus, the crossing of the CME by a field line causes a significant acceleration, which results in a greater energy of the CME. This effect is stronger for shorter durations CMEs, which is consistent with the findings of \\citet{2021ApJ...913..157L}. In addition, the authors of \\cite{Valgushev:2015} show that the acceleration of a SBO-CME is much weaker than of an Alfv\\'en-CME. Therefore, the acceleration of a SBO-CME is dominated by the gravity of the Sun, which results in a much lower energy. The gravity of the Sun is much weaker than the inertia of the solar wind, therefore, a small deflection of a SBO-CME by a field line results in a significant change in the trajectory."
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher-Schuett-Weber:2012,Steiner-Weber:2012,Weber:2013}), so far it has not been systematically understood under which conditions the scalar curvature condition \\eqref{scalarconcondition} will also hold. In this paper, we will show that the scalar curvature condition \\eqref{scalarconcondition} holds if the Jacobi determinant condition \\eqref{jacobicondition} holds and if the Ricci curvature of the metric $g$ is bounded from below. In particular, we obtain the following result by combining the works of \\cite{Ma:2003pc} and \\cite{Hertel:2012}: Let $(M,g)$ be a compact Riemannian manifold of dimension $n\\geq 3$ with bounded Ricci curvature from below. Then the tangential cone condition \\eqref{tangentialconecondition} implies the scalar curvature condition \\eqref{scalarconcondition}. In particular, if the scalar curvature $R$ of the metric $g$ is bounded from below, then the local product structure"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " of the step-lengths $\\sigma_k$'s. In the Landweber iteration, the global convergence is guaranteed by the rule $\\sigma_k = 2^{-(k+1)}$ \\citep{Landweber1978}, which is, however, not very efficient in practice. In our numerical tests, we observed that the rule $\\sigma_k = 2^{-(k+1)}$ produces a local minimum in the objective function of the parameter vector $b$ at iteration $k=3$, which is far from the global minimum. The gradient method is used to escape from this local minimum. However, the gradient method is not able to escape from this local minimum, and the iteration is trapped in this local minimum. We escape from this local minimum by using the line search method. We choose the step-length $\\tau_4 = 0.01b$, and then proceed with the Landweber iteration. The iteration converges to the global minimum within 10 iterations. We note that the iteration would also converge to the global minimum if we choose a larger step-"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the sequence $x_{k}$ to an \\emph{equilibrium} or a \\emph{stationary solution}, which is independent of the initialization $x_0$. In contrast, in the linear case, the convergence of the sequence generated by the regularized Newton method is dependent on the initialization $x_0$. In fact, the initialization $x_0$ only needs to be close to a solution of the system of equations given by the linear system and the linear system associated with the linear terms in the right-hand side of the Newton equation. In this paper, we show that the regularized Newton method is also an IFC-NC method, which is a strong evidence supporting the local convergence in the nonlinear case. The convergence of the sequence generated by the regularized Newton method is also independent of the initialization $x_0$ in the sense that the initialization $x_0$ only needs to be close to a solution of the system of equations given by the linear system and the linear system associated with the linear terms in the right-hand side of the regularized Newton equation. In the linear case, the linear"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the inhomogeneities. The above problem is usually reformulated in the following way: for a given $u \\in W_0^{1,2}(\\Omega)$, we look for the weak solution $u \\in W_0^{1,2}(\\Omega)$ to the Poisson equation \\eqref{eq:parabolic_diffusion_equation} with Dirichlet boundary conditions. The existence and uniqueness of such solutions can be shown by the Lax-Milgram lemma \\cite{Lax1950,Milgram1949}. The uniqueness of the weak solution is also related to the maximum principle \\cite{Lax1950}. The boundary value problem \\eqref{eq:parabolic_diffusion_equation} is called \\textit{well-posed} if the unique solution $u \\in W_0^{1,2}(\\Omega)$ depends continuously on the data, i.e., $u$ is continuous in the topology of $W_0^{1,2}(\\Omega)$. In this case, the solution can be recovered from the given data $u_D$"
        }
      ]
    },
    {
      "train_loss": 0.575875,
      "update_counter": 750,
      "epoch": 24.161290322580644,
      "val_loss": 4.1630859375,
      "val_entropy": 1.111328125,
      "val_ttr": 0.5836588541666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the maximum likelihood trajectory of a single star in the Solar Neighborhood as obtained with different selections of the \\ifmmode\\mathrm{Ly}\\alpha\\else{}Ly$\\alpha$\\fi{} target library. The figure shows three epochs:~(1)~the pre-explosion epoch,~(2)~the explosion epoch and~(3)~the post-explosion epoch. The adopted time in the past for the W-R explosion was set to be 17 Apr. 1994 (see Section~\\ref{sectors} for more details). In all the figures we highlight the position of the star by showing the orbits of various stars in the whole Timeline library. The figure captions list the maximum likelihood trajectories for the following planets:~(1)~Mercury closest to Earth's perihelion,~(2)~Venus closest to Earth's perihelion,~(3)~Mars closest to Earth's perihelion,~(4)~Jupiter at the time of the W-R explosion,~(5)~Saturn at the time of the W-R explosion,~(6"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the whole period $T$ of the simplified New Orleans flow, with approximately equal time steps. We also let $h_T/h_T=1$ so that the total fluid run time $T$ and the total fluid time step $Th_T$ are identical. Hence, the number of flow steps $N_F$ in the flow is $N_F=NT=300$. First, we observe that the a posteriori error is locally optimal with respect to $T$ within the range $0.01\\leq T\\leq 0.2$. We also observe that the a posteriori error is slightly lower on the meshes $N_T=0.01$ and $N_T=0.02$ than on the mesh with uniform time steps $N_T=0.16$ in the flow. This is probably due to the fact that we use a smoother preprocessor $P$ that captures all the cycles in the period, instead of aarser preprocessor as done in \\cite{Pusateri20"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method in the case of joining cylindrical bars, i.e., for $\\mathrm{Ba}=1$, is highlighted in the figure \\ref{fig:7}. The direct solution of the problem in Newman \\cite{DBLP:conf/stvr/Nawata00} is used for validation. The output of the function of interest is the contact force per unit length at the contact edges, a representative quantity for wear resistance of the bars. The use of time-stepping allows for large-scale investigations of the effect of joining parameters on the expected lifetimes. For the same joining parameters, the cylindrical bars have an expected lifetime of around $10^{7}$ cycles, while the truncated hexagonal bars have an expected lifetime of around $10^{8}$. The results of the element-free approach are within $20 \\%$ for the cycles, while the finite-element validation is within $10 \\%$ for the diameter and $30 \\%$ for the cycles. The results of this work are expected to be useful for the design of joining processes of slender"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% down to $1.8\\%$ in August 2013. The remaining errors are due to the presence of the antenna under-coverage issues. On the other hand, the results of the simulation show $100\\%$ error rate due to the inaccuracy of the $\\mathbf{x}$ matrix, which is a direct result of the air-hacking approach, presented in \\eqref{eq: estimation kpp }. In the following, we demonstrate the performance of the proposed method in real-life, by comparing the estimatedKPIPR with the RF-domain k-factor method. The comparison is presented in the form of a voiding map, for a site in the center of the city, in Figure \\ref{fig: voiding}. We can see that the proposed method of estimating KPIPR results in much lower values of power penalty in the digital and RF-domain methods, compared to the results of the in-service algorithm. This confirms the effectiveness of the proposed method in estimating the KPIPR, and the effectiveness of the proposed method in estimating the small power penalties"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": ",Valgushev:2016} it is shown that the same coronal hole from which the CME exited eminated a second CME on 20 Jan. 2015. The two CMEs showed similar velocities but flew from different regions of the corona and along different paths to the Sun.  As a consequence, their signatures in the CHL \\citep{Horne:1989} also differed: for the second CME there were no or very small brightening signs at the position of the LSNC \\citep{B\u00e9\u00efque_1989,Simon:2006}.  For the first CME there are large brightening signatures at the same position but they occurred at a different hour.  Thus, in the dynamic range of 12-24 minutes the signature resembles the brightening NBY \\cite{1976PNAS...73...105C}, in the range of 24-96 minutes the signature resembles the brightening L1 \\citep{1976ApJ...205.."
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2011,Jonsson2019,Guo2021,This2017} and references therein), the corresponding level set $F_r(x) \\leq c$ has not been used before in problem (\\ref{elasticproblem}). Here we observe the convenience of using a generalized level set formulation and thus directly minimize $\\fr$  such that $F_r(x) = c$ holds. Indeed, for $r = \\bullet$, this \\emph{channel}  condition \\eqref{channelcondition} is essential to obtain different solutions for the same given boundary conditions, compared to the traditional WCC formulation. This channel condition was discovered by\u4faf\u9633 Yang \\cite{Kaltenbacher2011} when investigating solutions of \\eqref{elasticproblem} for electromagnetic problems. Furthermore, the function $c = \\langle \\bfp, \\bfx \\rangle/\\rho$ has the advantage that it is well-defined for all choices of the reference density $\\bfr$ and hence does not involve any regularization. For a"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " of the eigenvalues of $\\bm{X}^T\\bm{X}$ and $\\bm{D}\\bm{X}$. The Landweber iteration is one of the simplest and stable iterative method and it is adopted in many scientific papers \\cite{r2l\\cite{r2l,r2l_5,r2l_6,r2l_7,r2l_8}. However, this iteration does not guarantee convergence to an optimal solution when $\\bm{s}$ is small. For an example with $\\bm{s} = 0.05\\cdot\\bm{d}$, which is enough to guarantee the convergence of the iteration \\cite{r2l_7}, we still find a wrong solution when the $k$ is set to be 1 (see Figure \\ref{fig: Landweber sp}). In this case, the optimal $\\bm{t}_k$ generated by the perturbed $\\bm{X}$ and $\\bm{s}$ is $\\bm{t}_{k} = \\bm{d}$ while the optimal $\\bm{"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the sequence $x_{k}$ even if the parameter vector $r_{k+1}$ is related to $x_{k}$ through $r_{k+1}=F_{f}(r_{k+1;x_{k})}$. This is in contrast with the linear case where one can prove the convergence of the sequence $x_{k}$ even if the parameter vector $r_{k+1}$ is independent of $x_{k}$. As we will see in the following, this local convergence theory enables a simple proof of Theorem \\ref{thm:convergence_NN_reg} even in the nonlinear case. We would also see in the following that the convergence rate in Theorem \\ref{thm:convergence_NN_reg} is in fact identical to the one in more general hybrid algorithms under local convergence in the parameter vector $r_{k+1}$. See \\cite{carlsson2018hybrid} for details. The advantage of the semi-optimal convergence rate in Theorem \\ref{thm:convergence_NN_reg} over the optimal one in \\cite{"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the local structure of the boundary. The $g_j$'s are often referred to as \\emph{jump-conditions} or \\emph{flux conditions}. The coefficients depend on the boundary data and the choice of the boundary conditions. We say that the problem on $\\Omega$ is \\emph{subcritical} if $0< \\mu_1 < \\mu_2 < 1$, critical if $\\mu_1 = \\mu_2 = 1$ and supercritical if $\\mu_1 > \\mu_2$. Throughout this paper, we consider the boundary conditions $u \\rightarrow \\bu$ in $\\Omega$ and $u \\rightarrow \\bu$ on the boundary with \\emph{no slip}, i.e., $u|\\partial\\Omega=\\bu|\\partial\\Omega$. For no slip on the boundary, the jump conditions are of the simplest form and hence the problem is the hardest. For no slip on the boundary, we assume that $\\bu \\in H^1_0(\\Omega)$ is the solution of $-\\Delta \\bu + \\bu=0$"
        }
      ]
    },
    {
      "train_loss": 0.244703125,
      "update_counter": 1000,
      "epoch": 32.225806451612904,
      "val_loss": 4.8798828125,
      "val_entropy": 0.935546875,
      "val_ttr": 0.5699869791666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the performance of your recommender system in two different values of $\\beta$ and for two different values of $n_u = \\{2, 4\\}$, as described in the previous figures. In the first panel, we have $n_u = 2$ and $\\beta = 0.05$, meaning that our algorithm runs in $17$ seconds for $1000$ users, as opposed to $333$ runs in recommendation modules because $p = 1000/2 = 500$. In this case, the additional power of $\\beta$ in the shuffling would be to add an additional $32$ more runs, which means an additional $16$ seconds, for an overall $17 + 16 = 33$ seconds more than the optimal case of $\\beta = 1$. In the second panel, we have $n_u = 4$ and $\\beta = 0.1$, meaning that our algorithm runs in $43$ seconds for $1000$ users,"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $3$-dimensional cube. The mesh size $H$ varies proportionally to the inverse of the control parameter $\\kappa=100H$, so that $\\kappa\\Delta t=10^{-3}$. The first two rows show the case where the flow is incompressible, while the last two rows show the case of incompressible flow with a non-constant viscosity. In the last two rows, the fluid is Patankar's model for laminar flow \\cite{patankar1980method}, whereas in the first two rows it complies with Lamb's theory for incompressible flow \\cite{lamb1948equations}. In the 3rd and 4th columns we show the a posteriori error for the discrete flux $\\int_K v(t)\\phi(x,t) \\mathrm{d}x$ estimated using the discrete divergence on the grid level and the computed flux $\\int_K v(t)\\phi(x,t) \\mathrm{d}x$, computed on all grid points over the whole interval, in terms"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy in the solution of the coupled problems depends on the solid \\textit{versus} the fluid solver, with choices ranging from high-optimal finite element methods of order $O(h)$ in the solid \\cite{Levin2010,Hochstenbach2010,Hochstenbach2011}, to semi-analytical methods of order $O(n?h)$ in the fluid \\cite{Schlebach2007,Zajac2008,Zajac2010}. Regardless of the choice of solid/fluid order of accuracy, a discretization of order $O(n?h)$ is expected to provide sufficient accuracy for micro-scale applications with single particle resolution \\cite{Schlebach2007,Zajac2008,Zajac2010}. Apart from the overall overall accuracy, a further development is ongoing. Levin and Zienkiewicz \\cite{Levin2015} proposed a new discrete stress tensor formulation for the solid \\textit{"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% down to $1.5\\%$ in $2.4$ GHz and from $10\\%$ to $4\\%$ in $70/80$ GHz. Furthermore, we observe that although the time-averaged overlap between the two imaging frames is small, averaging over the frames effectively reduces the scatterer densities on the floor with a mobile vehicle, which is reflected in the reduction of the average SEP. Again, this improvement in the image quality is attributed to the scanning frames, which reduces the noise in the images, allowing the RS-CCA to accurately find the minimum-distance matrix sub-matrix which contains the scatterer image. Also, as the vehicle moves in the network and different vehicles perform images, the centimeter-level road of different floors have mixed portions of the road covered by the image, which, if the method used here is applied, would result in a road image, which is actually a combination of multiple roads with different lengths. However, if the same reference frame is enforced for all the images in the image pyramid, the effect of cross-floor path-integration is mitigated"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}(Evan et al 2015, private comm.), based on the pictures of the CME before it became obscured by cloud debris, Peter Evan suggested that it was a single structure, associated with two distinct brightening regions at the two poles. It was observed by {\\emph{STEREO}}-A and {\\emph{SOHO}} and was moving towards the Sun towards Earth. It was a large structure, the largest for some time before consideration was give to asking nations to help in the rapid analysis and urgent eruption notification so the event could be observed as it transited the Sun. The fact that it was observed as a single structure by two S/C (although one at a nearly east-west crossing point, the other nearly north-south) means that the consensus of the analysis will be that this was a single large structure. However, in a private communication with authors after {\\emph{PSP}}'s first solar Enc., Evan admitted that this complex structure was the result of a number of smaller eruptions from a large association of sources rather than a single source producing the many structures"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Jonsson2015,Shang2017,Shang2019,Hochstenbach2010,Coron2013,He2017,Paffard2020} as well as in the case of quantum computing and state preparation \\cite{Nawata1996,Vishwanathan1988,Vipin2012,Pati1991,Sudip2014}), for the case of coupled oscillators in physics and neuroscience it represents a canonical property and has been used to describe emotional states of animals \\cite{Hocking1996,Hocking2000}, human behavior \\cite{Tas1997,Yonekura2000,Yonekura2002,Morisse2001,Morisse2002,Morisse2004,Imasato2007,Imasato2008,Im"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the $f(\\bm{x}_{n+1})$ and $g(\\bm{x}_{n+1})$ terms. In the case of the noisy data $ \\bm{x}_{n+1}$, we note that the matrix $A_{n+1}$ is non-positive always, and so the effective expression to determine the effective performance of the Landweber iteration is  $f(A_{n+1})+g(A_{n+1})=-f(\\bm{x}_{n+1})+g(\\bm{x}_{n+1})<-C_{n}+b$ for any $C_{n}>0$ and $b>0$. This effective performance is always negative some value no matter how large the initialization $\\bm{x}_{0}$ is. This happens because of the imaginary parts of the noise in the coefficients of the data and the effective filter $A$ due to the low-rank property. In fact, the Landweber iteration provides a good correction path to the spurious first minimum. However, it cannot be verified if the iteration produces a"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " for the parameter matrix, which is locally convergent, and the function tuples, which is convergent relative to the initial guess, but \\a0 from the initial guess may have already been not zero after the iterative procedure has run for a small number of iterations. In other words, \\a0 may always be guaranteed to satisfy \\texttt{max(|$\\mathbf{\\alpha}$|, |$\\mathbf{\\beta}$|):0} by maintaining a fixed threshold. Thus, we can use any initial value for \\a0 as long as it satisfies this constraint. On the other hand, the Lipschitz continuity of the function gradient in \\cref{eqn:linear_lip_condition} is also applicable in the nonlinear case, and thus the Lipschitz continuity of the parameter matrix can be used to guarantee the local convergence of the parameter matrix. The local character of the convergence in the Lipschitz continuity of the parameter matrix means that the product of the linear operator $\\mathbf{A}(\\mathbf{w}_{n+1}^{k})-\\mathbf{A}(\\mathbf{w}_{n}^{k})$ and the gradient $\\nabla f(\\mathbf{w"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the localized mass of the solution. We have $u \\in H_0^1(\\Omega) \\cap L^p(\\Omega)$ for $1< p<\\infty$, and $u \\in H^2(\\Omega)$ otherwise. For $d = 1,2,3$, the problem is trivial for sufficiently small $m_j$, very hard for large $m_j \\in (0,0^\\infty]$, and, for the time being, unknown for very small $m_j \\in (0_{-},0^\\infty)$. The reason is that, for $m_j \\in (0^+,0^\\infty]$, one has $u=g_1$ (trivial problem), for $m_j \\in (0_{-},0^\\infty]$, we have $u \\equiv g_1$ (very hard problem), and for intermediate values of $m_j$, we have $u \\sim e^{-m_j|\\Omega|}g_2$ (hard problem). In particular, we expect that for $m_j \\in (0"
        }
      ]
    },
    {
      "train_loss": 0.11223046875,
      "update_counter": 1250,
      "epoch": 40.29032258064516,
      "val_loss": 5.33203125,
      "val_entropy": 0.846923828125,
      "val_ttr": 0.6217447916666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the change of the prediction error and that of the relative error for both algorithms for both configurations as a function of the number of views for both regression problems. The upturn in the error for the relative error track in Figure~\\ref{comparison_whole_timeline_configuration_1} indicates that a major problem that both algorithms face are incorrect predictions or divisions by zero due to missing data or extremely large values in the data. In both problems, regression learns faster bi-directionally, i.e., $t_{user} = 0.25$ and $t_{server} = 0.0$, but this is not true for all timestep values. Moreover, we observe that the relative error starts decreasing (and the corresponding value of the error start increasing) sooner for $\\lambda_k = 3$ than for $\\lambda_k = 1$. Finally, we note that algorithm PRG shows a better behavior than algorithm BG in both problems for a large range of timestep values. In particular, note that the error in both problems start increasing sooner for BG than for PRG."
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the whole interval~$[0,T]$ with fixed mesh step~$h$. By Eq.~\\eqref{eq:assemblyAfp} the coefficient matrix~$A_T$ in Eq.~\\eqref{eq:fluid_residuals_uniform_equal} is calculated explicitly for each time step. In our simulations we obtain the following sequence of meshes$:~$ $ \\{Q_0 \\} \\cup \\{ Q_{n+1} \\} \\cup \\dots $ with successively refined until the norm~$\\||u-L^n(u)|||=1$ gets down to~$\\approx 10^{-10}$. From Table~\\ref{fluid_residuals_uniform_equal} it follows that even though the coefficient matrix~$A_T$ is calculated here explicitly, the a posteriori errors are in terms of the approximation errors on uniform meshes quite robust and comparable to the errors on triangular meshes. In addition, we observe that the error basis is stable in the mean-square sense and it converges to an optimal finite dimensional approximation of the space~$C^0_T$ for larger number of time"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy will be that of the solid solver which remains below the global accuracy expected to be satisfied by the FEA framework. The accuracy will be further discussed in Section~\\ref{sec:as-ass} in context to numerical experiments. The fluid solver will evolve with time steps large enough to obtain a sufficient resolution during the fluid time. This is done by monitoring the position of the particle, obtained from COMPACTIFICATION \\cite{ermeshanskiy2018compactification}, estimated time until a collision with a solid element, or detection of a sinker deep bottom point, development of draft, or obsolescence. All this information is provided to the fluid subproblem by the solid subproblem through a \\emph{run-time system description}, i.e., a dynamic model of the system during simulations. The construction of the system description is discussed in Section~\\ref{sec:sys_desc}. Section~\\ref{sec:ref_obs} then reviews the reference observation method which is used to communicate reference state (and therefore time) information between the fluid and solid subproblems"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% down to $10\\%$ in the RSL images. However, though the median length of the skeletons in Fig.~\\ref{fig:h_images} seems to be around $10$, the eyes actually move across the beamformers much faster, i.e., on the order of $100$ samples (images). To demonstrate this, we computed the locations of the median line across all frames in $2.4$ GHz over $90$ seconds and connected the corners. This is shown in Fig.~\\ref{fig:sl_images}. Moreover, although variations in the median line can exceed $90^\\circ$, the actual position of the receiver. Such large variations indicate that the users are moving their hand in different directions or the people around them are moving, which interferes with the SL processing \\cite{user} and therefore, the estimated beamformer images are noisy (\\figref{fig:n4}). In contrast, fluctuations of the median line in $5.9$ GHz were usually between $20-50^\\circ$ and the connected"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": ",Valgushev:2016} it is shown that from  the energy conservation and the magnetic field alignment point of view the HCS is a continuum and that numerous \"islands\" of HCS exist in the corona and solar wind. An island is an area where the rotation direction of the magnetic field remains in line with the current of the field (in the Parker spiral framework, the field rotates in the opposite direction to the solar wind wind flow thanks to the Parker spiral). An island can be characterized as a region where a coronal magnetic field structure is active (regions where the flux ropes are forming and propagating) or where the flux ropes are dormant (regions where flux ropes are dormant). In the latter case, the process of the magnetic field restructuring, to reconnect the HCS, is reactivated when a new flux rope jumps out from the region in some future solar cycle, and propagates outward. In the former case, the flux ropes active regions propagate outward directly, without any external reactivation. It is evident that such regions do not exist in infinite sizes, and that the boundaries of the islands must"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Giacobini2010,2015ApJ...798...80B,2019MNRAS.489.3584Z}) the computational focus so far has been on the \\emph{weak tangential cone condition}. This constrains the vertices of the $\\b$ space partition of unity with respect to the tangential cone condition of degree $m-1$ in some arbitrary direction $\\boldsymbol r\\direction$, i.e., the vertex function $\\nu_{\\boldsymbol r}\\colon\\mathbb{R}^{3}\\to\\mathbb{R}$ is such that it is greater than or equal to 0 and monotonically increasing in $\\boldsymbol r$. The corresponding geometric interpretation is that the requirement that the star shall have radius $\\radius$ and center $\\center$ must be met only in the in the tangential plane spanned by $\\center$ and the direction $\\boldsymbol r\\direction$. The discrete version of the weak tangential cone condition is the so-called \\emph{dispersion limit}, defined as the case when the vertex density $\\"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the $f(\\bm{x})$ and $J(\\bm{x})$. In most problems we are interested in, the Landweber iteration does not change the quality of the generated solutions. The heuristic methods relies on efficient evaluations of $f(\\bm{x})$ and $J(\\bm{x})$. For the $m$-th Landweber iteration we assume these evaluations to be efficient. However, for some problems the effective $f(\\bm{x})$ evaluations may include several high-order terms. In such problems, the Landweber iteration starts to explore the high-dimensional local\u978d\u70b9 and outputs a new solution that is closer to the local minimum. The iteration then produces a new solution that is closer to the true solution. However, it happens that the effectiveness of the $J(\\bm{x})$ estimates and effective $f(\\bm{x})$ evaluations are domain dependent. In some cases, the effective $f(\\bm{x})$ evaluations include only few low-order terms and the $J(\\bm{x})$ estimates are simple and effective. Therefore, the output solution of"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " consistency of the NCA industrial problem~\\eqref{eq:n_a}, ~or the NCAGC problem~\\eqref{eq:n_b} when the parameter $\\theta$  is  localized  in some large scale but explicit way. The very recent paper~\\cite{nva2020} presents such a local convergence theory and beats all non-linear simulation based convergence theories in readability and rigor. See Section \\ref{sec:compare-with-va} for more details.  In particular, the functional equation for the convergence stability given in Theorem \\ref{eq:the_s_func_linear} still holds in the localized parameter case. For example, if we have a centralized parameter $\\theta_i = \\bar{\\theta }_i + \\theta^{\\rm inc}_i$, where $\\bar{\\theta} _i$ is the center and $\\theta^{\\rm inc} _i$ are the incremental deviations, the convergence stability function on a subiteration $k$ is still given by $f_{\\infty}(\\bar{\\theta} )[k] -"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the resistance of blood vessels, which is supposed to follow a force-matching procedure on each time step as explained in Section \\ref{sub: force-matching}. The $J(u)\\ , u\\in \\mathbb{R}^N$ is the Jacobian of $u$, and $\\Delta_c  \\in \\mathcal{M}$ with $\\mathcal{M} \\subset \\mathbb{R}$ is the central nervous system force, which models cutaneous sensation and movement intentions in a force representation, similar to the work of Brandt  and Bucy \\cite{bucy1972neural}. In a classical theory of neuroscience, the nervous system force $\\Delta_c$ is multiplied by a constant $c>0$ to capture the ``force dose'' effect: the stronger the force, the less likely to receive motion. More recently, competition from other senses within the nose has been proposed to limit the intensity of the sensations, which is modeled by a linear barrier $u \\lessapprox -c$ \\cite{romo2011neuroscience}.   Since"
        }
      ]
    },
    {
      "train_loss": 0.0738837890625,
      "update_counter": 1500,
      "epoch": 48.354838709677416,
      "val_loss": 5.640625,
      "val_entropy": 0.803466796875,
      "val_ttr": 0.5963541666666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " comparison between our algorithm as described in Section~\\ref{ calibrated_timeline_configuration } and the algorithms described in Section~\\ref{expl:calcomprob} for creating a timeline configuration in Case 1 (Roman numbers) and Section~\\ref{generated_timeline_configuration} for creating a timeline configuration in Case 2 (letters). In both figures, the curves timelines created by the algorithms for creating a timeline configuration for both cases are hidden under the curves created by the algorithms for creating a timeline configuration using the \\textit{configuration scale parameter $\\beta$} given in Table~\\ref{configuration_scale_parameter}. Therefore, from the figures, we can see that the proposed algorithm in Section~\\ref{ calibrated_timeline_configuration } produces the same timeline as the optimal timeline using statistical analysis such as cross validation. For example, in Figure~\\ref{comparison_whole_timeline_configuration_1}, the optimal timeline such that $P_{\\mathcal{S}}(S_t(n_t,j)) = 1$ for all $j \\in \\mathcal{N}$ for both Case 1"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the domain $\\Omega = (0.1,0.3)$ in the case when the flow is a simple\u74e6\u76c6-\u53cc\u5706\u7403 flow (see Table~\\ref{reconstruction_data}). For the a posteriori residuals we use the weakest among the residuals for each variable $u$ and $v$, which approximate $u$ and $v$, respectively. From what follows, we assume that the error estimate is smaller than $\\frac{10}{120} \\frac{\\log(t_{n+1})}{\\log(t_{n})} \\min_{1 \\leq i \\leq 5} \\e_i \\frac{ \\lambda_{n} + \\lambda_{n+1} }{ \\lambda_{n} $, where $\\lambda_{n} = t_{n} \\frac{1}{2} \\left(\\sum_{i=1}^4 \\ell_i^2 + \\ell_{0,5}^2\\right)^{1/2}$ is the time step size of the uniform mesh $\\Omega_n$ in seconds, and $\\"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the appendix we describe a solution of the solid subproblem which uses a finetuning procedure based on adaptive refinement in both phases of the coupled system, such that even complex solid substrates can be modeled robustly. The idea behind the method is to use essentially finite-element solution of the solid subproblem at the same grid-points as provided for the fluid solver by upper bounds for the fluxes at intergrid-points in a compact formulation. Since the finetuning procedure uses a local improvement criterion based on the solution error in both phases, no additional run-time information is needed. Such a solid subproblem can even be used as a fast check for anomalies in the solution of the fluid solver when used as a verification procedure for the higher-accuracy fluxes between grid points. Details of the procedure are given in \\cite{Frougny2007} and The\u7ae0\\ref{TheSolidRatePRB} shows in detail how the solid subproblem adapted to high-performance multirate schemes looks like for rates as low as once per time step of a high-performance DFT scheme. The"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% on the $2.4$ GHz channel map in Figure \\ref{fig:channel-map-2.4}, while in the $10$ Hz frames the error rate is estimated to be the same as the floor error rate of the images. So, combining the benefits of scans and the sub-scan mode, a best of both approach has been suggested, see Figure \\ref{fig:img-sub-scan}. In the sub-scan mode, one scan is added after every $51$ images. It was found that the sub-scan frame should be very short, in order to reduce the error in the number of received echoes, as in the $10$ Hz mode, so the error is sometimes caused by the error in the number of echoes rather than the image quality. In the way proposed, the error is found to be, on average, $2.5$ errors per measurement location, for a valid error count between $0$ and $3$. Sub-scans have the advantage of reducing the delay between the acquisition of the next layer of images"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, the authors propose their version of the CME equilibrium and classify them into two types depending on the location of the imbalanced field lines \u2014 approaching and departing. In the departing-IMF version, CMEs move away from the Sun and their acceleration is directed toward the interplanetary medium. This version also indicates that some of the energy that is injected into the interplanetary medium by the CME is extracted from the system and disappears into the solar wind wind wind medium. The clear conclusion of \\cite{Valgushev:2015sae} is that some of the energy from the initial $1$ MK temperature plasma configuration into the large CME extends into the surrounding interplanetary medium down to the solar wind wind wind wind temperature of $7\\times 10^3$ K. The volume of this material must be conserved, otherwise the solar wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind wind"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2020}), the computational burden of implementing this condition in practice can be problematic. Note that the evaluation of the existence of a minimum always requires to compute the value of the functional target at all points of the given discretization, which typically involves computing the gradient of the functional target several times. Also, as the overall functional target may be large, it is often not feasible to directly use a LBF method to solve for the direction \\eqref{directiongivencon}. Instead, a popular alternative is to use a more robust iterative method to solve for the direction, such as the well known Newton-Gauss-Seidel (NGS) method (see e.g.~\\cite{Resadle1972,Schaub2011,Hirschbach2013Book}). However, the NGS method involves solving for the direction in the $d$-dimensional case using the constraint $\\ell_{d} = 0$ \\eqref{tangentialconecondition} by directly projecting the direction $\\mathbf{t}$ computed using the new direction"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the eigenvalues of the diffusion matrix $A$. The eigenvalues of $A$ are actually connected to the localization of the solution of the global minimization. The Landweber iteration \\cite{dk76} is one of the most popular heuristic in the field of dynamic programming. This iterative procedure start from the initialization solution which is regular (i.e. correspond to the solution at the final step of the method of gradients steps \\cite{DSW86}). The diffusion coefficient is estimated from the observed Jacobian at the solution. Then the solution is updated by a single Landweber iteration. If the initialization solution is regular, then the iteration produces a regular solution corresponding to the solution at the end of the iteration. However, under some conditions, the initialization does not need to be regular to produce a regular solution. For example, if the observed Jacobian at the solution is of the form \\eqref{eq:by}, then the revision of the initialization solution is to replace the matrix $\\Lambda(\\theta_k)$ with   \\eqref{eq:wby}, from which solution one can prove that the iteration"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of $\\{K^{l}_t\\}_{t\\in\\mathbb{T}$ for fixed $l\\in\\{1,\\cdots,d\\}$ within a neighborhood of $t_0$ for any $t_0\\in\\mathbb{T}$. This is stated explicitly in Theorem \\ref{thm:NLStrong} below. The convergence of the original sequence $\\{K^{l}_t\\}_{t\\in\\mathbb{T}$ is said to be \\emph{converging at infinity} when there exists some probability space $(\\Omega,\\mathbb{P})$ on which $\\{\\|K^{l}_t\\|_{\\infty}\\}_{t\\in\\mathbb{T}$ is bounded for all $l\\in\\{1,\\cdots,d\\}$. The time filters in the continuous case (i.e., the $L$-bit versions for the unicast problem in the linear and nonlinear cases) are even shown to converge in probability. Thus, we can derive the weak convergence of the underlying probabilities depending on the time filters and starting states. The weak convergence of class of probabilities and states imply that the rate limits of"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the diffuse parts of the pigments $\\phi_j \\subset \\Omega$ as well as a sharp interface between the different pigments $\\phi_j$. The problem addressed is how efficiently different pigments mix together once mixed.  Since imaging based approaches rely on structural information, we only consider imaging techniques that are reliable for such mixed air pollution.\" Through a review of both the mathematical and experimental theories, we are confident that the approximate sharpness of the interface between different pollutants can be described accurately using the interface parameter $t=|\\mu(\\nabla \\phi_j)/\\mu(\\Delta \\phi_j)|$, where $\\mu(\\cdot)$ can be the mean of the particle or the sun's illumination. In general, one has $t\\approx1$ for a sharp interface, and as the interface becomes increasingly blurred one has $t\\approx e^{-d}$, where $d$ is some (small) constant. It has been shown that for $t\\approx e^{-d}$, one can approximate $t$ using $\\exp(-d|x-\\cdot|)$, where the gradient and the differential are calculated over"
        }
      ]
    },
    {
      "train_loss": 0.0605830078125,
      "update_counter": 1750,
      "epoch": 56.41935483870968,
      "val_loss": 5.828125,
      "val_entropy": 0.76611328125,
      "val_ttr": 0.5986328125,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " comparison between our algorithm described in this paper and the following two methods (which are based on the work of~\\cite{wiesemann_shared_memory_and_networked_storage_2019}) in event time processing of networked storage. Both methods can be divided into three steps: computing the current values of the objects that agents converge on visiting, generating the update notifications based on the previously received messages, and checking whether the current value of an object is computed. In the first step our algorithm because it does not maintain a table of values for each object, is 4~times faster than the first version based change the value of the object only if the notification is received from a agent who has not yet executed the corresponding post-condition. The second version of the previous method differs from the first one by maintaining a set of values that have not yet been processed instead of a single value for each object. Therefore, the main difference between the corresponding if-statement is a call to the set operations. The second step of the first method and the third step of the second method are quite similar,"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for the Laplace problem with a rectangular domain. The top half of Table~\\ref{fluid_residuals_uniform_equal} is dedicated to the Eulerian discretization and the bottom half to the Lagrangian one. Note that in the upper half of the table the element volume $V_e$ goes zero while the number of its edges $N$ increases without bound. This means that the area from which we derive the residual estimates originates increasingly from the element's boundary and the solution value at the element node becomes very large by comparison. This causes growth of the error and makes further analysis pointless. On the other hand, the lower half of the table shows results for the Lagrangian discretization where $V_e$ remains much larger than $N$ and the solution value at the elemental node has a small value. We observe that the error growth is, in this case, completely different and that the estimator remains predictable in this case. We also observe, that in the Eulerian discretization the error on the uniform meshes increases with the discretization step getting stuck at a residual of 3.34 $\\cdot 10"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the fluid subproblem the step size in terms of time $d_t$ is chosen in order to advance $n_{tf}$ time steps of the overall algorithm or, in other words, $d_t = 1/{d_s}$. The solid subproblem is updated by advancing one step of the sequence of iterative solutions of the discrete versions of the problems under study. The step size in time for the solid subproblem has to be sufficiently small such that the current solution in the current element of the solid domain does the global solution correctly. The convergence of the iterative solution procedure is of the order of $2^{-k}$, where $k$ is the number of steps~\\cite{Petrov_2009}. To guarantee that the solution does not change too much, de Solminants criterion for the convergence of the method was applied for choosing the step size in time for the solution of the solid subproblem. For more details of the method, see \\cite{Petrov_2009, Doxanlidis_2004, Doxanl"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% down to $10\\%$ in the DPC. In addition, shifted or distorted frames reduce the reliability of the DPC results on all floors and leads to an increase in the average SEP between $50$ and $75\\%$. For example, distorted frames on floor $2$ leads to an average SEP on that floor of $88.6\\%$, whereas the errors on that floor are mainly of floor type with a $SP>100\\%$. As\u4e3e\u4f8b\u5982\u56fe~\\ref{fig10}\u6240\u793a, one frame with a distorted frequency has incorrect antenna names and the other frame that is shifted has correct antenna errors but the time between the frames has passed and some antennas have changed frequency. As we previously discussed in~\\ref{subsec:reliability}, the uncertainty of the detection into the built matrix increases the reliability scores on average by $20\\%$. The reason is that, we need to extract the devices from the background noise and extract the antennas from the signal peak and both can have uncertainty. Furthermore, the closer the frames are, the harder it is to extract the individual"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "awg}, it was shown that continuity of the magnetic field and zero rotation of its radial component across the CME edge results in an absence of the equilibrium and specification of the velocity of the CME-associated field across the CME edge. It is explained by the non-zero injection velocities of plasma and fields inside and outside of the CME. Moreover, such an approach allows for the calculation of small CMEs such as those observed near the Sun and which play an important role in non-thermal ionization of the interplanetary medium \\citep[see e.g.][]{2009ApJ...701..334R,2011ApJ...736..126R}. Moreover, the equilibrium velocity is not required to be exactly equal to the local solar wind velocity, which allows to evaluate the effect of PIL crossing by a theoretical analyst who has a time meter. Both these effects lead to an evolution of the energy loss matched field alignment mechanism from the straight line to a wave like behavior. The theoretical analysis suggests that waves of magneto-hydrodynamic (MHD) type"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2018}), the literature is dominated by the applications to \\emph{single} Lagrange normal forms \\cite{Kaltenbacher2014}, and in particular the D-model tensor $\\mathcal{T}^{ij}_{kl}$ is set to satisfy $\\nu_{kl}=0$ and $\\nu_{ik}=\\nu_{lk}$ when $\\ell_i=\\ell_k=\\ell_l$ and $\\ell_i=\\ell_k$ (or $\\ell_i=\\ell_l$) when $\\ell_{kl}$ is large. As a consequence, the symmetry of the tensor $\\mathcal{T}^{ij}_{kl}$ with respect to the permutations $(i,j)$ and $(k,l)$ is no longer guaranteed. This much simplified model tensor is defined by $\\13$ out of $21$ independent parameters, which means that it contains $13$ information points (or less if $\\nu_{ilk}$ and $\\nu_{ilk}$ coincide). To recover the full $180$ parameters of the"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the $f$-function and its derivatives.  Typically, the second order estimate is enough and we describe this below. The first local minimum reported in \\cref{cref:local0} and \\cref{cref:local1} were found by a heuristic procedure using the Landweber iteration and a $L_2$ error estimate for the second order Taylor expansion of the $f$-function. The error estimate is shown in \\cref{cref:error_estimate}. As shown in this image chart, this error estimate is significantly larger than 1 and the iteration direction is to the local minimum. We had discovered this surprising result by randomly starting new iterations and this situation was repeatedly encountered. To clarify why the iteration does change the direction to the solution direction, see the paper by   T. Tanaka \\cite{Tanaka-Kohmoto-Tsui-JMP92, Tanaka-Aoyama-JMP93}, which shows that the reason for the noise in the direction of the iteration is due to an statistical uncertainty concerning the matrix $ A f_1 $"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual vector in the aggregated neural network state, and the overall state is not guaranteed to converge. For example, consider the following hypothetical scenario: the first layer of the neural network is synchronized to the target function $a^T \\mathbf{x}$, while all other layers are not. We claim that the overall state of the networked system (which includes errors between all layers) does not converge to any specific value. However, we do observe local convergence on the individual neural network state. We demonstrate this scenario by way of demonstration in Figure \\ref{fig:local_convergence_example}, where a target function is connected to each layer. It is clear that the overall state will never converge to be equal to the target function, but individual vectors will converge to their corresponding target value.   It is worth emphasizing that in practice, we do not care about the overall convergence of the overall state since we only care about the performance of the integrated networked system on the target function. Instead, we do require that the individual gradients \\emph{ converge } to a smaller space as discussed in Section \\ref{sec:"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the diffusion for $j\\in\\mathcal{C}$, with $\\mathcal{C}\\subset [1,2]$ is an ordinal number. We define the potential $\\phi:\\Omega\\rightarrow\\R$ as the solution to \\eqref{Eq: K-elliptic }, and $b_\\mathcal{C}\\in\\mathcal{C}^\\infty(\\Omega;\\mathbb{R})$ for $\\mathcal{C}\\subset [1,2]$ an ordinal number defines the diffusion $b_{\\mathcal{C}}$ on $\\Omega$ via the standard convolution $b_{\\mathcal{C}}*w := \\int_W w\\,dt_{\\mathcal{C}}$, with $W := \\{w\\in\\mathcal{C}^\\infty(\\Omega);\\\u9650\u5ea6\\{0\\}}$ and $dt_{\\mathcal{C}}(x)=\\displaystyle\\lim_{h\\rightarrow0}\\frac{b_{\\mathcal{C}}(x+h)-b_{\\mathcal{C}}(x)}{h}$. For $\\mathcal{C}\\subset [1,2]$ an ordinal number, we (continuously) extend $b_{\\mathcal{C"
        }
      ]
    },
    {
      "train_loss": 0.056841796875,
      "update_counter": 2000,
      "epoch": 64.48387096774194,
      "val_loss": 5.8125,
      "val_entropy": 0.787841796875,
      "val_ttr": 0.6100260416666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the change of the compliance cost resulting from our better predicting different portions of the trajectory. These figures present the compliance cost obtained by predicting different temporal information of the trajectory in the form of a slider window. The trajectory information is predicted using the method proposed in~\\cite{wittenbrink_learning,gilsen_multi}. For each trajectory, the compliance cost obtained by predicting different portions of it are shown, respectively in Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2}. Figure~\\ref{comparison_whole_timeline_configuration_1} represents the case of predicting the whole trajectory as shown in Figure~\\ref{comparison_whole_timeline_t_1}, while Figure~\\ref{comparison_whole_timeline_configuration_2} represents the case of predicting only the temporal information of the trajectory. In both figures, the allowed compliance costs resulting from predicting different temporal information of the trajectory are obtained using the method proposed in~\\cite{wittenbrink_learning,gilsen_multi}. The allowed cost,"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the whole interval~$[0,T]$ with a fixed interval size~$T_k = \\frac{T}{k}$. Although the a posteriori errors~$e_k$ decrease toward the error bound~$ \\leq \\sqrt{\\frac{c_0}{m} \\frac{T}{k}}$ expected at later iterations, the fluid-neer residuals~$\\frac{e_k}{A_k} - A_k^{-1} \\bm{f}_k$ on the uniform meshes do not achieve the correction rate~$\\frac{c_0}{m} \\frac{T}{k}$. The maximal error~$\\frac{e_T}{A_T} - A_T^{-1} \\bm{f}_T$ still converges to a limit smaller than the error bound~$\\sqrt{\\frac{c_0}{m} \\frac{T}{k}}$, but this limit is only achieved in Table~\\ref{fluid_residuals_uniform_equal} at the final iteration~$e_{T-1} = 0$ on the last iteration~$k ="
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the fluid subproblem \\ref{eq:oe_dt_fluid}, $\\alpha$ is the coefficient of the semi-implicit procedure, which is closely related to the preconditioning by the diffusion matrix $D$ in \\ref{eq:D}. For a preconditioned linear system $A\\xb=\\yc$ advantageous by a bound in the number of subiterations maximum $max\\_sub=10$ of a block-Gauss-Seidel procedure is $\\alpha=\\frac{\\yc}{2D\\xb}$, because $A\\xb=\\yc$ then has as many subiterations maximum $max\\_sub$ necessary to accuracy $e=\\yc/\\yb$. For large values of $D$ and $\\yc$ this can be quite efficient since $2D$ has a major effect, and $\\xb$ only needs to be consistent towards the solution $ \\xb^{\\star}$ according to \\ref{eq:D}. We therefore propose the choice $\\alpha=\\frac{\\yc}{2D\\xb}+\\frac{0.5}{max\\_sub}$. A further increase of the accuracy would"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "1\\% down to $1.6\\%$ in Fig.~\\ref{fig:scanning_images} B and C, respectively. One possible explanation for this result is that the scanning frames help the mobile stations to stay close to their positions and therefore, the arriving waves coming from the close-by vehicles cancel or reduce any error that might be learned from the observations. The results of this paper can be extended to other frequency bands and RAEs. The results so far show that even if the bandwidth of RAEs is small, the effect of scanning frames help to correct errors in RF placement, which in turn, reduces the errors in RAF assignment. Thus, the results of this paper can be extended to other frequency bands and RAEs and be seen as a single framework to manage and improve wireless networks at different RF bands and RF placement using scanning frames. There are several open problems that can be investigated in this paper. The effects of the drop-in users\u2019 traffic or, new users joining the network, new vehicles arriving to the network, and changing traffic distribution after a fixed time period in the RF placement"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae these authors} the magnetic field structure inside and outside the CME region was analyzed and it was shown that the large and remote-location coronal hole from which the CME emerged was active and susceptible to CMEs for at least a couple of months in advance. The authors concluded that the small-scale interconnected magnetic flux tubes through which the CMEs emerged must still exist and influence the evolution of the CME until its orbit becomes unwound completely. Strong statistical arguments were presented by \\citet{Gopalswamy:2018mrs} who showed that the subsequent CRs/particles commonly associated with large CMEs have typical speeds of $\\sim$300~km~s$^{-1}$, much slower than the CME speed. They concluded that the subsequent SEP events are physically related to the preceding CME and suggest a two-phase evolution of such events, initialized when the CME is released from the coronal source and subsequently evolves into the CRs/solar wind wave-particle interaction scenario with a significant enhancement of CRs/particles emssages. In our opinion, the strong anisotropy of CRs/"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014, MAG2012, Reimann2012, Ofman2015}), so does the Cauchy condition, which --- when imposed to the coefficients of a system of polynomial PDEs enunciated in \\eqref{system} --- ensures that no derivative vanishes on the boundary of the domain $\\Omega$. This condition can be imposed to the potential $L$ via the ```no-boundary \"' condition $x_d = 0$ for all $x \\in \\partial \\mathbb{D}$. This, of course, would render certain coefficients, e.g. $\\widetilde{W} \\and \\widetilde{N}$, trivial. However, in the case of the coupling coefficient $\\widetilde{L}$, it was observed (via degree reduction) that the number of polynomial equations (independent of the $x_d$ term) scales linearly with the number of degree $d$ points, i.e., $L(\\mathbf{x}) = 0 \\has \\has \\has \\has \\has \\has \\has \\has \\"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the $f(\\bm{x}(k))$ and the $\\bm{x}(l)(k)$, where the second is easily obtained by applying the iteration to $\\bm{x}(k)$ instead of $\\bm{w}(k)$. For the Landweber iteration $\\bm{x}(k+1)= \\bm{x}(k) - \\alpha A^{-1} \\bm{w}(k)$, the second estimate always becomes a noisy artifact if the size of the problem is large. The noise, however, often stays in the matrix $A$ and hence does not affect the subsequent iterations, which makes the multiple iterations of the Landweber iteration a somewhat stable algorithm. This is different for the Newton-GMRES iteration since its GMRES subproblem is sensitive to the artifact matrix $B$. However, for the fictitious minimum corresponding to the noise matrix $A$, we assume that its abnormal values are the same size as the noise in the regular coefficient matrix. Therefore, the minimum corresponding to the noise matrix $A$ also looks like a noise without any artifact minimum in function value between the"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual vector $b_{i,k} \\rightharpoonown \\zero$ while the error of the aggregate signal \\(y_{k} \\approx \\tilde{y}_{k}\\) remains arbitrary. The local character of this convergence may be surprising, especially when it is paired with the semi-global convergence in the linear case. The underlying reason for this phenomenon is the inclusion of a global synchronization term, $\\mathcal{L} (y_k) = \\mathcal{U} (b_k) + \\mathcal{R} (y_k)$, in the nonlinear case. The global synchronization term $\\mathcal{U} (b_k)$ aims to address the problem ofalleviating the signal loss caused by the global error $E(y) = \\frac{1}{M} \\|y-\\tilde{y} \\|_2$. However, this feature also comes with some drawbacks. First, the global synchronization can in general be very difficult to use when the control system (1.1) is nonlinear since most of the nonlinear function $f()$ are not  globally continuous."
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the diffusion process $\\mathcal{W}_j$ on $\\Omega$ is defined via the diffusion coefficient $D_j \\in \\mathbb{R}_{++}$ as $ \\mathcal{W}_j := D_j \\omega_j$, $ \\omega_j \\in C^2(\\R^N_{\\mathbb{R}})$,$ \\mathcal{W}_j$ is defined above depending on the values of $D_1$ and $D_2$. Our goal is to derive a useful system of nonlinear integral equations of special types for estimating the constants $C_1$ and $C_2$ in (2.1). For this, we first introduce an auxiliary function $G_\\nu (\\mathcal{W})$ that has a explicit formula as given in \\eqref{equation:GN}, which is closely related to $\\mu_{1,d_\\nu}$. On the basis of \\eqref{equation:GN}, we derive in \\cite[Lemma~\\ref{lemma: induced GF}]{2022ArXiv:2007.11894}"
        }
      ]
    },
    {
      "train_loss": 0.0484931640625,
      "update_counter": 2250,
      "epoch": 72.54838709677419,
      "val_loss": 6.01171875,
      "val_entropy": 0.72607421875,
      "val_ttr": 0.6090494791666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the change of the peak height in the signal power spectrum for the configuration 1 and 2 as a function of time for different durations (or snapshots) as the time-series analysis result. Here we consider the detection performance by changing the number of subwindows $N_{sub}$ in the processing. In this study, we consider three sizes of each subwindow as $N_{sub} = 100$, $200$, and $300$ and the results are shown in Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2}. For each size of $N_{sub}$, the results are obtained by considering the maximum likelihood estimation and median estimation. The results are considered as the perfect knowledge of the ship's existence and its location on the ship. The maximum variance of the peak height is 0.4\\% in this case, and the median variance is approximately 0.6\\%~. In this case, we can see that the peak height change slightly for different durations and different subwindows sizes"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $1\\times 10^5, 1\\times 10^6$, and $1\\times 10^7$ in dimension $n=1$, $n=2$, and $n=3$. Results for the $L_2$ estimator on the regular meshes are taken from Table~\\ref{first_results_uniform_equal} in~\\cite{RENO:2012}. For the $L_2$ estimator we observe an improvement in accuracy compared to the a priori case. For dimensionality $n=3$ in Table~\\ref{fluid_residuals_uniform_equal} we observe a decrease in the error by a factor of about $2.2$ as compared to a factor of about $2.3$-2.4$ for the $L_2$ estimator. We also observe that in all cases the residuals on mesh $I$ are larger than on mesh $II$ (except for the $L_2$ error on grid $3$ where the reverse is true). This can be explained by"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy will be those of the dissolved chemical independent from the time-step of the solid and fluid simulations, and dependent on the time-step of the chemical simulations. Such an approach will require an exchange of data between the sub-models, which \\cite{CompareMethods} showed that, in case of an average coordination of particles larger than two, a discrete particle model should be used. The computational cost will be the solid and fluid simulations, time-integrated for a few time steps, balanced against the time it takes for $10^6-10^7$ measurements to be made in the database, each taking one CPU millisecond. The chemical equations can be run in a parallel fashion, with each equation running on a separate CPU, with each atom's state independent, and the total run time proportional to the dissolved concentration. The dissolved concentration will be of order of $10-100/t_k$, where $t_k$ is the time-step of chemical simulations in seconds, where higher values providing faster results. Such an approach will be appealing for"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "13\\% down to 3\\% in Fig.~\\ref{fig:scanning_images_2}, when the scanning method is combined with images. A detailed discussion of the effects of the learning with views and learning with frames is laid out in \\cite{BEST2017}. In this paper, we observe that the distinctive differences in the images between floors are harder to learn from, e.-4 to e-5, than the differences within a floor, on the level of errors between  e-5 and e-4. Due to this, it is beneficial to use a combination of scanning frames and learning, as shown in Fig.~\\ref{fig:scanning_images_2}. Kami\u0144ski et al. also discuss a harder to perceive phenomenon, called ``spatial shift'' in \\cite{KAMIENSKI2015}. They describe how the algorithm learns the wrong path dependencies, and puts sensors herniated by the wrong pipes, in the wrong location, and in the right location. This effect is prevalent on higher floors, where images give a"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, the heuristic reasoning is put forward to explain why rotating CMEs rarely exist: due to spatial and temporal effects, hot material is ejected in front of the body, while the main structure stays behind. However, this does not explain why a rotating CME does not exist  in a universal sample of 36 events observed by {\\emph{STEREO}}-A and {\\emph{STEREO}}-A during May 2013. As shown in Fig.~\\ref{fig:comertorus}, the rotation angle of CMEs changes significantly depending on the European (blue) and the American (red) coordinates; in other words, the rotational parameters observed from different S/C\u2019s observe different kinematics for the same CME. According to the authors, such a difference is caused by the uncertainty of the acceleration and acceleration termination of a CME, depending on the ability of the S/C to observe lower coronal and lower equatorial eruptions regions. In this version, the heuristic explanation becomes factual; due to the uncertainty of the acceleration and acceleration termination of a CME, a person watching the event from different S"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2013,Ge2013,Jonsson2014,Ofman2014}), most important for this work its application in quantum field theory, more precisely in the study of the nonlocal correlation functions. What can be shown is the so called double-regionalization formula for the nonlocal two-point functions of the field variables in quantum field theories. While the regionalization formula tells us how to compute local correlation functions from the regional times, the nonlocal formula can be derived by regionalization using the tangential cone condition \\eqref{tangentialconecondition}. This is shown in the works by \\cite{Ge2013} in quantum circle mechanics and by \\cite{Jonsson2014} for quantum field theory. Besides its application to quantum field theory, circular time analysis (e.g. by means of the regionalization formula) is a novel technique to understand the dynamics of dynamical systems \\cite{Ofman2014,Hochsler2016,Hochsler2017,Ries"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradients. The Landweber iteration does not require this since its iteration is so simple. However, a recent development \\cite{Qiu2017} shows that although the AIC does not have gradient, it has a effective search direction for the numerical iteration.  This direction changes step by step due to numerical errors. This turns out to generate a sequence of local solutions that approach the global minimum in a way that show the characteristic of a local solution. However, there is a problem. Because this sequence of solution is generated by the regular forward difference for numerical gradient and  local interpolation for the effective minimum parameters, it will always generate a sequence of solution that first is local noisy solution then global solution. Such like problem has been discussed in \\cite{Edelman1998} for molecular chemistry calculations and computational biochemistry. Such a problem is also relevant for the minimum solution of AIC. This mistake can be easily prevented by simply using a proper gradient method. The numerical proof is shown in  Figure \\ref{fig:local_minimum_error_solution}. It shows the"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the entire sequence $k\\)-th layer weight matrix $W_{k,j}$ or $W_{k,l}$ in a fixed column or row, respectively, and we also have the convergence of convergent submatrices of $W_{k,j}$ or $W_{k,l}$. The convergence of the actual matrix elements of $W_{k,j}$ or $W_{k,l}$ is local in the sense that we can lose convergence consistency of early updated matrix elements and subsequent newer ones. For example, consider the nonlinear layer $C\\in {\\bf C}_{k,c}$ which starts not to be normalized, and hence, neither is $C^{k}$ in the end. Under the general convergence theory in this paper, it indicates that even though the first few layers of $C^{k}$ may not be normalized, once $C^{k}$ is passed to later layers, it will be normalized due to the sublinear convergence in Theorem \\ref{thm:linear_nonlinear} and then $C^{k+1}$ will be normalized. This indicates that except"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the local roughness of the solutions. We let $F:\\Omega \\rightarrow \\R^d$, $d=2,3$, be the solution to an ill-posed partial differential equation (PDE), which is for instance caused by a faulty data $f$ and an inappropriate smoothing of $f$ in $G$. A representation of the PDEs we use to generate our simulations is omitted for brevity, but more information can be found in other papers within this special issue of PMR. The underlying data $f$, which we assume to be random, is connected to an error in the solution $F$, because it yields solutions $G$ with a poor accuracy $A$, which is measured in terms of $L_2$ error. We assume that $F$ is given by $f = X_{D}$, where $X_{D}$ is a data noise, $D$ is a typically small random variable, and $X_{D}$ is sufficient to generate an error $A$. The data noise $X_{D}$ is generated by our data generating function (DGF}, $G"
        }
      ]
    },
    {
      "train_loss": 0.0512900390625,
      "update_counter": 2500,
      "epoch": 80.61290322580645,
      "val_loss": 5.822265625,
      "val_entropy": 0.7412109375,
      "val_ttr": 0.5804036458333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of our theoretical predictions for the perfect works $W_{1}$ and $W_{2}$ for the first configuration and second configuration, respectively. The theoretical results for our multiple-choice algorithm are given in Figure~\\ref{comparison_whole_timeline_configuration_1} (green circles), for the optimized algorithm $W^{(OP)}_{1}$ in Figure~\\ref{comparison_whole_timeline_configuration_1} (red squares) and in Figure~\\ref{comparison_whole_timeline_configuration_2} (red squares), for the optimized algorithm $W^{(OP)}_{2}$ in Figure~\\ref{comparison_whole_timeline_configuration_2} (blue triangles). In both figures, the theoretical predictions of the optimal algorithm $W^{(OPT)}_{1}$ is shown in Figure~\\ref{comparison_whole_timeline_configuration_1} (blue diamonds) and $W^{(OPT)}_{2}$ in Figure~\\ref{comparison_whole_timeline_configuration_2} (orange crosses). Moreover, the timeline of $W_{1}$ is shown in Figure~\\ref{"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " $K_0 < K_1 < \\cdots < K_n$ with node coordinates $\\mathbf{x}_i = (i\\Delta t \\mathbf{e}_j) + \\mathbf{r}$ and $\\mathbf{e}_j$ is the standard basis vector. The outer iteration is applied directly to these refined meshes. The inner iteration is applied to a mesh generated by a simple interpolation on the nodes with an accuracy $\\sim \\Delta t$. It is clear that the estimator becomes perfectly accurate on $K_0$ if $\\Delta t$ is sufficiently small, i.e.~$n \\Delta t \\to 0$. We set $\\mathbf{r} = 0$ in Table~\\ref{fluid_residuals_uniform_equal} in order to highlight the effect of $\\Delta t$ on the estimator. From Table~\\ref{fluid_residuals_uniform_equal} one can observe the following properties: (i) the accuracy of the a posteriori estimator on the equidistribution meshes is independent of $n$ and it is independent of $\\mathbf{"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends on the solid subproblem, because the convergence of the iterative solution of the fluid pressure by use of a finite element method in the fluid subproblem is of order $O(k^2)$ for $k$ changes in the solid domain. We point out that a similar strategy has been used in other contexts \\cite{Verseghy2014,Verseghy2016}, however, in these contexts the object of integration is a simple source term, rather than the solution of a boundary value problem. In contrast, the overall accuracy of a concrete embankment structure depends strongly on the accuracy of the subproblem for the fluid-structure interaction. Appropriate formulations for boundary value problems related to problems in hydraulics are available in the vast literature on that topic. \\cite{LSIC1998,CULP2003,CULP2006} provide a start for problems with single or multiple open boundaries, while problems with imposed solid boundaries can be solved, e.g., from \\"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "16\\% down to about $10\\%$ in the last half of the measurements due probably to increased correlations between the scanning frames of different floors. The results in the $3.4$ GHz plot are similar, although the decrease in the number of floor errors is even more pronounced with the drop down to about $35\\%$ at the middle of the measurements, and finally decreasing down to about $15\\%$ up to the end of the Fig. \\ref{fig:scans} for the weakest link (f$_{c} = 2.4$ GHz). Again, the decrease in the floor error without the use of scanning frames is not consistent but can be observed in the decrease of the floor error from about $70\\%$ down to about $55\\%$ in the middle of the measurements. While the results presented in the above figures are for the link Antenna 1, similar results can be deduced for the other links and antennas. Moreover, increasing the sampling rate from $150$ Hz to $300$ Hz (not shown here) resulted in reduced correlation between"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, the authors concluded that on a longer-term scale, the equilibrium in a magnetic structure before the development of  a shock feature must be such that the magnetic field becomes almost parallel to the bulk velocity field, induced by the expanded perturbation in the solar interior. This version of the equilibrium also exists in the case of a simple Alfv\\'en wave, and therefore is true both for the wave and in the expanded magnetic structure formed by the expansion. In fact, at what point this version of the equilibrium holds true is determined by the background temperature field, and by how much the magnetic field becomes parallel to the velocity field. According to \\citet{2015ApJ...802...17M}, the background temperature field becomes insignificant on the scales of interest. In that case, the magnetic field becomes completely deflected by the background field, moving to larger longitude, and becoming negative. This version of the equilibrium seems to exclude the observed CME. A common feature in observations of such structures is that the maximum deflection of the magnetic field is obtained at the interface between the high-density, low"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Milan2015}),  the minimization of (\\ref{eq:elasticmodel}) is not necessarily easy.  The standard methods of gradient methods or the more recent NCP methods \\cite{Meso2010} do not directly apply because the functional (\\ref{eq:elasticmodel}) depends also to the boundary.Moreover the gradient methods may be not consistent when they are applied because the solution of the tangential cone condition \\eqref{tangentialconecondition} has to be stored and concoted when the solution is divided in pieces by the gradient method.  As a consequence it may lead the method far from the local solution.  When the solution is not trivial, the gradient methods apply badly and the minimum energy is not reached. This problem existed also for the classical spherical model, where the functional to minimize is straight, and it was studied in \\cite{Onoda2011}.  In \\cite{Onoda2011,Onoda2013} several schemes of gradient methods have been extended to the spherical model and"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates obtained by the Landweber iteration have see no difficulty in being accurate enough for practical application of the heuristic. In fact, the specific iteration is designed to avoid this concern. The specific iteration does the main computation in the while loop, which is repeated indefinitely. It thus may be noisy or incorrect in the early iterations. For the gradient estimation, we see that the error is mainly caused by the initial value for the gradient. In Chapter~\\ref{chap:theory}, we showed that the value for the gradient at the local minimum $x^{(m)}$ depends on the starting point, i.e., in general $g_{k}(x)$ is not a fixed function of $x$. Thus for some starting points, the iterative procedure is not even guaranteed to converge to $x^{(l)}$. However, we have observed that many starting points have the\u5e78\u8fd0\u5730 that $x^{(l)}$ and $x^{(m)}$ lie very close to each other. This is because the potential function is minimized by $x^{(l)}$ only\u5e78\u8fd0\u5730 a small range of values of the parameters"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of random vectors, although their gradient may be incorrect. The local convergence theory in the linear case was stated in Substep 2.2 as the convergence of random gradients. We point out that there are other indicators to characterize a function sequence, e.g., convergence in sublevel set sense and in ball sense. In addition, we do not move the underlying approximation space in the conventional gradient descent. Therefore, the establishment of the stationary value and convergence rate in the nonlinear case still retains the local character. \\cite{nashine2020against} shows that NADE can sense small gradients in the overall loss and thus converges to the solution even when the conventional gradient descent may not converge. NADE's convergence is close to the optimal solution's convergence and it converges faster than the classical gradient descent. \\cite{nashine2021naDE} further shows that the optimal solution of GD is not the global minimizer of the loss and the sequence of its gradient converges to zero instead. We consider the solution that can approximate the optimal solution and has the minimum convergence rate over the"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the departure from strict convexity of $A_j:\\R^p\\to \\R^q$, $j=1,2$, of the optimal control input. According to our assumption \\ref{assumption:convexity} on the functions $g_j$, the domain $A_j(\\xb)$ is either strictly convex if $g_j(\\xb)\\equiv 0$ has the form $\\xb$, $\\and \\xb$ and $g_j(\\xb)\\equiv \\xb$, where $\\xb$ and $g_j(\\xb)$ are smooth with different smoothness, or it is convex if $g_j(\\xb)\\equiv \\infty$ has the form $\\and \\xb$ and $g_j(\\xb)$ is smooth with the same smoothness \\cite{ITK2000}. For the sake of clarity, we note that the domain $A_j(\\xb)$ is of the form $\\and \\xb$ and $g_j(\\xb)$ if $\\and \\xb$ and $g_j(\\xb)$ are smooth with different smoothness, and"
        }
      ]
    },
    {
      "train_loss": 0.0613642578125,
      "update_counter": 2750,
      "epoch": 88.6774193548387,
      "val_loss": 6.103515625,
      "val_entropy": 0.70458984375,
      "val_ttr": 0.5930989583333333,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 12) (27, 12) (28, 12) (29, 12) (30, 12) (31, 12) (32, 12) (33, 12"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performances of our two configuration i.i.d. entry-exit and residual training versus the configuration without using the timeline information. The results are quite surprising, in the first configuration, which does not make use of the timeline information, the performance of residual training is slightly higher on some windows (figure~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} at stride of 2 and 4) and lower on some windows (figure~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} at sliding window of 3 and 10). In addition, both configurations perform worse when the sliding window is large (figure~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} at sliding window of 10). On the other hand, when using the timeline information, the performance of residual training is always higher than in the without using the timeline information case ("
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " that approximate the original one by factor $T/{T_I} = 20$ on the rectangle $[0,2\\pi]^2$ with $N_I = 100$ and $N_F = 300$. In all cases the weight function $\\omega$ and the function $g^i$ are defined as in the uniform mesh, but now the control parameter $\\theta$ takes the value $0$. Thus, the expected speed of the refinement process goes by the factor $20^2 = 400$. For the fluid problem we set $\\alpha_i = \\beta_i = 0.05$, \\textit{e} := 2, and $u^* = 1.0$. In Table~\\ref{fluid_residuals_uniform_very_unequal} we present results for the a posteriori error estimator on time meshes with the same shape as in Table~\\ref{fluid_residuals_uniform_equal}, but with the distance between the nodes on the boundary of  $10$, that approximates"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends on the solid subproblem and the fluid subproblem, and the expected expectation for the end result depends on both, the initial data and the time limit. In order to solve small initiation problems, initial data with small initial deviations from the steady state is suggested. For linear problems, the overall tolerance for the solution for satisfactory results is $\\lambda = \\frac{\\|\\mathbf{b}_0\\|}{\\|\\mathbf{b}_1\\|} < 10^{-4}$, where the initial border conditions $\\mathbf{b}_0$ and $\\mathbf{b}_1$ are provided by the user and define the boundary conditions for the solid and the fluid problems, respectively. This overall convergence tolerance is divided into approximately $10$ steps over $0.25$ years of simulation time, and the initial vector $\\mathbf{b}_0$ is defined as follows: in the linear problem, $J \\mathbf{b}_0 = (\\lambda \\mathbf{e}_1)$, with $\\lambda$ as defined above, and $\\mathbf{e}_1$ is"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $1.2\\%$ in the last \\(80\\) frames shown in Fig.~\\ref{fig:scanning_floor_2}. The reduction in floor errors cannot be explained by a decrease in the floor shift. Thus, a frame sequentially showing \\(80\\times 2^{n} + 1\\) scans, from level \\(n=1\\) to \\(n=80\\), was also found to reduce the floor shift and also the shift compared to frames showing \\(80\\times 2^{n}\\). However, there was no much reduction in the floor errors by increasing the number of scans to \\(80\\times 2^{n} + 2\\) or \\(80\\times 2^{n} + 3\\) as shown in Fig.~\\ref{fig:scanning_floor_3}. It can be attributed to the fact that a few antennas on different floors may perform well mechanically or electronically when the user is moving but perform poorly when the vehicle is stationary due to reflection or transmission losses  as shown in Fig.~\\ref{fig"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "rfa} it is shown that the multi-ring model should be corrected by a non-zero tilt of the inner core layer in comparison with the outer edge. In a realistic model the change of the bending radius due to the extending of the magnetic field\u6280\u672f away from the center field streams does not follow a linear trend hopping into a transition into another category of change called the Stewartson effect \\citep{2010ApJ...722..963L}. The CME center extends further outward with increasing bending radius as observed by \\citet{Edrington:2017mdb} on CME launch and the first CME approach \\citep{2020ApJ...893...25Y}. This is explained by enhanced diffusion outwards which offsets magnetic fields\u6280\u672f beyond the transition level. Additionally, fields within a magnetic field technology field source are concentrated\u6280\u672f into a narrower volume compared to the corresponding flux rope extending beyond the technology boundaries. This concentrates the electric field technology and increases the\u6280\u672f speeds of electron escape out of the technology. These factors all contribute to an increase in the estimated speed"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2018}),  the optimization procedure in \\eqref{objectisp}, while straightforward, does not offer a direct way to evaluate the objective function at the current solution and therefore it is important to look for a fast way to evaluate the function. This is especially important in real applications, where a user might want to dynamically track a target from motion, where the current procedure would need to recompute the object state. Further, it is also important to note that a part of the objective is \\textit{on the way}, that is, it is being computed while moving from one sensor to another, and therefore a fast computation of the objective is necessary. For such purposes, we note that inserted in \\eqref{objectisp} is the pair of optimization problems \\eqref{localproblem}, which can be solved quickly right at the location where the optimal projection is found, and this looks like it can be quickly solved by the GPS unit on a smart device. This allows for very quick computation of a target location on a regular basis, and further, the user can verify"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates obtained by the GKB scheme are much more efficient (in terms of execution) that the ones of other established schemes. In our initial design, we used the Landweber iteration \\cite{DM11}, which is based on the gradient estimate provided by the GKB scheme, and we experienced a dramatic improvement of the efficency compared to other popular navigators. However, we recently noticed a strange behavior during the learning of our first navigator (based on single particle evolution) at the temperatures relevant for heavy element nucleosynthesis. The navigator started to produce sequences with unusually high hydrogen concentrations, which were clearly incompatible with solar system predictions.  To understand this strange behavior, we carefully tested the performance of the Landweber iteration. We noticed that as the iteration grows, the iteration vector tends to  a specific direction, which  is not the direction of the gradient. This happens around the region of the space, which is the actual goal, which is not contained in the initial condition. In Fig.~\\ref{fig:local_first}, the connection between the iteration and"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " consistency of each $x_{k+1}$ to some $x$, where $x$ is an aggregate of unknown \\textit{hidden} unknowns in the sense that $x$ may contain an unknown sub-sequence of $x_k$ as well as an unknown sub-sequence of $x_{k+1}$.  Without a joint consistency estimate across all dimensions, it is not possible to judge whether the sequence $\\{x_k\\}$ and $\\{x\\}$ are identical. The simplicity of this property can already bring great benefits to users of this algorithm. For example, one can simply ignore any further derivation of $x$ after the iteration $k$ is completed, and collect the latest versions of $x_k$ and $x_{k+1}$ to compute the consistency estimate $(\\rho(x,x_k), \\ell(x,x_k))$, and then choose the update direction for the next iteration $k+1$ by (17), (18) or (19). One can then validate whether $x_{k+1}$ is already a good approximation"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $p$ near normal direction $\\mathbf{n}_j$, with $\\mathbf{n}_j \\in \\mathbb{S}^{N-1}$ and $\\lim_{j\\rightarrow \\infty} |n_j| = \\infty$. We use a numerical method to solve the stochastic Poisson-Neumann problem $\\Delta U + Au + \\sqrt{q} \\overset{\\mathbb{I}}{\\longrightarrow} 0$ in $\\Omega$, with $U = \\sideset{}{_{S}}{\\sum} u_j \\mathbf{n}_j$, $q_j = g_j \\raise{\\sup}_{\\Omega}{\\sum} \\left( \\mathbf{n}_j \\cdot \\mathbf{n}_l \\right) \\between{U}$ (where $\\Between{U}$ denotes the characteristic function on $\\Omega$ and $\\sup{\\sum} = \\sup_{k \\in {\\mathbb{Z}}^{n}} $ if needed), and $A U = \\sideset{}{_{S}}{\\sum} a_j u_j \\mathbf{n}_"
        }
      ]
    },
    {
      "train_loss": 0.0363447265625,
      "update_counter": 3000,
      "epoch": 96.74193548387096,
      "val_loss": 6.412109375,
      "val_entropy": 0.6611328125,
      "val_ttr": 0.5787760416666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 12) (27, 12) (28, 12) (29, 12) (30, 12) (31, 12) (32, 12) (33, 12"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of our full timeline solution (Fig.~\\ref{comparison_whole_timeline_configuration_1}) and the reduced solution (Fig.~\\ref{comparison_whole_timeline_configuration_2}) for different event rates. We find that the solution with our event ordering and event aggregation performance is approximately 1.5-2x better than the solution without event aggregation. In the case of our full event timeline solution (Fig.~\\ref{comparison_whole_timeline_configuration_1}), as the number of events increases, the solution of ET2 is performing approximately 1.5-2x faster than our solution. The solution of ET1 is performing approximately 1k-1.5k times faster than the corresponding solution of ET2. In both cases, what happens is that the event volumes is increasing, but the caches is not changing. Therefore, the solution of ET2 is performing significantly under performance speculation, due to the variable time for getting an event from the cache into the processor core. What is even more noticeable is that the solution of ET1 has a large performance"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the entire interval~$]0=0,T[--\", where the value of $T$ decreases linearly from an initial value $T_0$ such that $\\eta_l(T) = 1$, i.e., the first element of the uniform grid. As expected, the estimators smooth out the step function nature of the initial error and become approximately linear in $T$. (We point out, in comparison with the previous section, that the error on a linear grid does not smooth out for finite interval, but develops a step at the switching point.) For the simplest model ($L=3$, $t=1$), the a posteriori estimators are very inaccurate in general, but reasonably accurate for sufficiently large $T$. As the time converges to zero, we expect non-linear error $\\approx 10^{-3:4}$, while the estimators predict it to be around $\\approx 10^{-6:7}$. For more refinement ($L=100$, $t=1$), the estimators are very inaccurate in general, but reasonably accurate for sufficiently large $T$. In part"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will again be governed by the local in time and in space interpolation steps for both phases \\cite{Hirschfeld2008}. Importantly, in order to allow for a clean design of the numerical method, we propose to {\\it}assume {\\it}  on the solid phase the following two limitations compared to the single fluid problem. First, we {\\it}assume {\\it}  that {\\it}  the shape $k$ of the particle follows the linear trend in the last $2 \\cdot p-1$ steps, i.e., high-order particle geometries can be considered. Second, {\\it}we {\\it}assume {\\it}  that {\\it}  the deterministic value $k_{\\rm last}$ clearly differs from its statistical mean by at most $\\lambda_k$ in the last $2 \\cdot p-1$ time steps, where $\\lambda_k$ is the global particle bias in the $k$-phase. The following property will be important, the statistical particle bias $\\lambda_k$ tends to be smaller"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $1.2\\%$ in the last \\(60\\) frames shown in Fig.~\\ref{fig:scanning_floor_2}. The reduction in floor errors seems to peak at \\(C$=\\{$2500$\\} and then decreases again after \\(C=4000\\). The number of scanning frames seems to optimize at \\(C=14000\\mhz\\), and further increase of \\(C\\) does not seem to benefit the results any more. However, it is noticeable that the reduction of floor errors is very significant after \\(C=10000\\mhz\\), which could be a frequency peak effect. We can see in Fig.~\\ref{fig:scanning_floor_2} that the scanning frames with the lowest floor error are those obtained with a every other transmit antenna, i.e., $i\\pm2$ where $i$ is the antenna index of the frame. It could be noted that the error is reduced when the RF signals from each BS are combined into a single measurement, i.e"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, the null slalom configuration was considered and it was shown that the dominant factor in determining the trajectory of a CME is the reconnection-driven jet it is embedded in, and not the initial structure of the eruption. Here, we consider the CME to be initiated by a small, partial solar eruption known as a small isolated coronal hole (ICH). This region has low corona density and short lifetime compared to other escapees such as helmet streamers and exit channels. The IHC is likely to have formed as a partial coronal hole \\citep{2007SoPh..253..337L}, which was subsequently driven by interchange reconnection \\citep{2008ApJ...675L.221Z} with the formation of a predominantly outwardly propagating jet. For a detailed analysis of jet formation and evolution, see \\citet{Zhe:2012p4366}. Our consideration of a later phase of the CME, as it propagates at least 0.3~AU (as seen in Fig.~\\ref{fig:zac"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2018}),  the optimization procedure in \\eqref{objectisp}, while conceptually simple, is for us at least a somewhat conceptual puzzle. While there are many ways to formulate the \\eqref{objectisp}, we are particularly intrigued by the optimality criteria in \\cite{BremsPatternMatch2015}, where the authors use a similar formulation to we do at the end of Section~\\ref{patternmatchstability} to address and solve an open problem in pattern match-ing by these authors and Y. Chizat. And in this paper they use a similar formulation to establish a novelest inequality over the optimal vector $V$ in \\eqref{objectisp}. And in this paper they use a similar formulation to establish a novel inequality over the optimal matrix $C$ in \\eqref{objectisp}. And in this paper they use a similar formulation to establish a novel inequality over the optimal matrix $D$ in \\eqref{objectisp}. And in this paper they use a similar formulation to establish a novel inequality over the optimal matrix $D$"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates obtained by the GKB scheme are more accurate than the ones obtained by the LKB scheme for two important exceptions. The first one is the first local solution provided by the Landweber iteration, which requires no gradient estimate, and the second one is the spurious solution obtained at the threshold, which also requires no gradient estimate. For the latter, we discussed in Section~\\ref{sec:test} that the difference between the true gradient and the Estimatred Gradient can be very large, thus increasing the error for the convergence estimate of the Landweber iteration. For the highest convergence rate, a pre-smoothing step could be adopted to the Landweber iteration \\cite{3,11}. However, this approach would significantly slow down the convergence rate. Therefore, we choose another approach. We directly modify the Stegemann iteration \\cite{8}, which was adopted in \\cite{11}, by replacing the LKB gradient with the GKB one, and we call it \\emph{Modified Stegemann iteration}. We explain this modified iteration in Appendix~\\ref"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " consistency of the federal node controller under a initial neighborhood of the control graph $P(K, M, d)$ in which all nodes are included.  In contrast, the convergence theory in the linear case is a global one and it is not required that whether a controller can be reached from $K$ to control a node at any time.  We prove that local convergence is indeed equivalent to global convergence and the trajectory under consideration is only needed to have access to the global control graph $P(K, M, d)$ and no other knowledge is required.  In the local convergence theory, we also utilize the stability of the underlying road network obtained by applying the federal controller $K$ to the network.  This stability analysis captures the road network stability and can be easily applied to examine the stability of the road network without requiring advanced knowledge of the network.  Our paper provides a control convergence theory for nonlinear federated control of a class of road networks in which the state dynamics are known and the control graph contains control-constrained roads.  This covers various road networks of interest and in particular includes both linear and"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the fluid circulation. The $N-2$ dimensional boundary $\\partial\\Omega$ is equipped with a characteristic shape $\\gamma_\\Omega$, and $\\gamma_\\Omega$ and $\\partial\\Omega$ are assumed to be smooth. Furthermore, we assume that $\\gamma_\\Omega$ is a function of the boundary conditions, and is considered as a part of the domain. On $\\partial\\Omega$, $L$ is a smooth smooth smooth link, and $\\nu_j$ is the characteristic shape in the $j$-th direction, depending on $L$. Thus, the boundary conditions are gasketed. The fluid $\\Sigma\\in\\mathcal{W}^2_0$ is a water flow determined by a force $F$. When the fluid level $x_1=0$ is fixed, the $N-1$ dimensional fluid movement is gasketed and the $N-1$ dimensional fluid movement is gasketed. When the fluid level $x_1>0$ is fixed, the gasketed water flow is gasketed, and the gasketed water flow is"
        }
      ]
    },
    {
      "train_loss": 0.0345810546875,
      "update_counter": 3250,
      "epoch": 104.80645161290323,
      "val_loss": 6.544921875,
      "val_entropy": 0.660888671875,
      "val_ttr": 0.625,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 12) (22, 12) (23, 12) (24, 12) (25, 12) (26, 12) (27, 12) (28, 12) (29, 12) (30, 12) (31, 12) (32, 12) (33, 12"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of our full timeline solution (Left) and the sliding window solution (Right) under two configuration names \\textit{Data\\_Aware} and \\textit{Cache Friendly} respectively. We find that our data-aware algorithm is much faster than the sliding window algorithm in all configurations. In configuration \\textit{LowData}, it is more than 4x faster in the number of sequences per window and in the time required to finish the window processing. In higher data configurations, it is even faster such that it is about 10x in \\textit{DataAware2} and \\textit{DataAware3} configurations. In comparison, in configuration \\textit{CacheFriendly} we observe that the performance difference between our algorithm and the sliding window algorithm is lower in the higher memory configurations and is still significant in the lower memory configurations. However, in the last two plots, it becomes obvious that our algorithm performs much better even in the hightest memory configuration such that it is about 3x in \\textit{DataFriendly2} configuration and about 64x in \\textit"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2\\times 2$ square. In all cases $k=th$ where $th$ denotes the thresholding parameter and $k=3$ for the thresholding parameter $th=10^{-4}$. For $k=3$ and $th=10^{-4}$, liquid is simpler than air and the fluid converges to a constant value around the edges while the fluid residuals uniformly disappear up to the global accuracy level $2^{-8}$ for both fluids. Thus, the fluid residuals converge to some constant value due to the drift-freeness effect in this problem and in conjunction with the preconditioner recovers the expected error. However, this behavior is not existent for smaller thresholds, we expect linear behavior in $th$ for $th\\ge1/4$. However, we observe that air converges to a large value around the edge while the liquid value remains smaller and consequently the fluid residuals disappear up to the global accuracy level $2^{-8}$. The disagreement between the predicted error and the actual error results, suggests that the estimator on the fluid side has not been tested carefully"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will again be governed by the local in time and in space interpolation steps for both phases \\cite{Hirschfeld2008}. Importantly, in order to allow for a clean design of the numerical method, we postpone details on the solid interpolation steps to \\cite{MSP2017}. Only the overall principles and the time scale will be covered, given that adaptive solid interpolation is a well established technique (see \\cite{MSP2017} for a review and implementations in \\cite{MSP2017rl,MSP2017ru,MSP2017uni}). As in \\cite{MP2007}, the time evolution of the solid is characterized by three time scales. The first (fastest) order corresponds to biochemical reactions (reproduction, enzymatic processes, etc.), the second (intermediate) to reorganization processes in the protein conformation space, e.g., adaptation to environmental conditions, and the third (slower) order to structural evolution. All of the is"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $1.2\\%$ in the scanning frames compared to $1.7\\%$ in the non-scanning frames. The results show that scanning frames improve the accuracy by reducing the effects of barycenter changes on the same floor as well as floor transfers. Besides, the results confirm that the addition of the angular information to the maps through the use of scanning frames helps in improving the accuracy by about 20\\% as opposed to not using the angular information. The results also indicate that the use of a closer angle transfer fraction (ATF) improves the accuracy \\textit{only} in the single floor case. However, the use of a closer ATF combined with scanning frames improves the accuracy by about 15\\% in the inter floor case as well. Besides, the use of a closer ATF combined with the variable OT results in about 26\\% higher accuracy in the inter floor case compared to the non use of closer ATF. It is worth noting that the effect of the ATF on the accuracy is not very significant in the single floor case. However, it"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuity of the zero-radical aspect of the spherical harmonic expansion of the CME plasma density implies that CMEs are not randomly oriented in space. It follows from the previous section that the same thing holds for the rotation rate of the CME-AR system. In other words, george-parks-CME is not a random collision between the CME and S/C, but a very specific event, whose exact shape can be fully determined several days in advance if the observation of the solar wind at large distances (at least 8 solar radii) is wide-angle and of high resolution.  It is also shown in \\cite{Valgushev:2015sae} that the rotation direction of george-parks-CME changes during the S/C motion evolution, which is a consequence of the evolution of the plasma and magnetic field properties at the CME-AR and S/C-platform boundaries, the interactions between them, and the transformation of the propagation direction of the system. In particular, the george-parks case may occur"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2018}),  the optimization procedure in \\eqref{objectispprof} is still an empirical procedure for the time being. Finding parameters $\\bm\\Theta$ and $\\bm\\Lambda$ which satisfy the tangential cone condition is not possible in general, and it may not even be possible to derive a subset of the Euclidean space in which the condition is satisfied. However, as we showed in the beginning of this section and in \\cite{Crivitz2019}, a direct relevance of the optimization procedure in \\eqref{objectispprof} to real-world applications can be obtained by selecting the weight matrices $W_1,W_2$ according to \\eqref{selectiveconvention}. Thus, the tangential cone condition \\eqref{tangentialconecondition} is satisfied in the subset $C_{W_1,W_2}$, which can be considered as a small sub-hypercube (see \\cite{Crivitz2019), where the solution of \\eqref{objectispprof} is valid. As examples"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates obtained by the GKB method has been shown to satisfy several criteria for efficient rules for the landweber iteration. In \\cite{Cover18}, a heuristic strategy was proposed for the landweber iteration based on the gradient estimates obtained by a more sophisticated iteration called the p-filter. This heuristic strategy suggests to choose the best $p$ filters (on $m$ parameters) out of all the filters. It was observed that the proposed heuristic strategy solved quickly a problem or demonstrated that the problem is solved. However, there is one problem on the experimental results in \\cite{Cover18}. The main argument in \\cite{Cover18} is to conclude that the landweber iteration pushes the problem towards the optimal filter and hence the p-filter iteration. This argument is based on the following observation: the first minimum after lifting the problem to a higher dimension then the number of filters then suddenly becomes the global minimum including the original dimension problem. This observation is not valid for the landweber iteration with $n=2$ and $p=3$"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " consistency of the federal node controller under the locally closed-loop system, which means that the function of the controller cannot be viewed as the response of the global law, but to a certain extent it is true. The convergence of the output variables is locally convergent, i.e., for the output variables close to initially converged ones, other output variables will go to the right states and eventually converge to the right states. \\cite{COOPERATIVECONTROL} mentions that the theoretical framework could be extended to address the global convergence, but the computational burden is too great. The global convergence of the law of the leader cannot be obtained in this paper, but it is agreed that the local law converges to a fixed point. We cite from \\cite{COOPERATIVECONTROL} for confirmation. In their algorithm for converging to a local equilibrium, the output variables of the users are updated, and the federal controller at the users is updated. We continue to use the expression \"local equilibrium\" in this paper because it does not affect the understanding of this paper.  In \\cite{FederalNLP} it is stated that the"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the fluid circulation. The $N-2$ dimensional boundary $b_J(\\mathbf{x},t)\\in \\mathbb{R}^{N-2}$ is called the ``boundary flux\" and $\\nu_J(\\mathbf{x},t)\\in \\mathbb{R}$ is the velocity magnitude, or velocity magnitude velocity, which models how strongly the fluid particles are moving, or in other words, how ``viscid\" the fluid is. This system is a generalization of Dubin's car dynamics and quasi-linear theory is the first method to study such models. Essentially, the quasi-linear theory provides a bifurcation theory for systems of  Dubin car types, and study (local) stability of stationary heads up flows \\cite{DSWStyle}. Later, a system of Dubin car types is used as a potential candidate for water wave problems with a layer of fluid over a rough surface, and <cite>ISVV...</cite> presents a quasi-linear wave problem into a quasi-linear PDE model and provides a bifurcation theory for that model. However, the classical quasi-linear"
        }
      ]
    },
    {
      "train_loss": 0.035873046875,
      "update_counter": 3500,
      "epoch": 112.87096774193549,
      "val_loss": 6.37890625,
      "val_entropy": 0.671142578125,
      "val_ttr": 0.5911458333333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of our theory and simulation results for the sharing-based allocation policy. The parameter values used in our calculation are the same as those in Figure~\\ref{comparison_whole_timeline_configuration_1}. In this policy, an ARFCN is assigned to a packet and then the ARFCNs are sequentially assigned to the following packets, until the packet is finished the selection. Therefore, a packet's arrival time has a significant effect on its assignment process. High throughput packets generally have low probability to be the first to arrive at the system. Therefore, we focus on the case where the arrival rate $\\lambda_s=0.05$ (i.e., half of high throughput packets have low arrival rate) and the arrival rate $\\lambda_s=0.1$ in the figure. When $\\lambda_s=0.05$, the arrival rate is low enough for a packet to be the first to arrive at the system. In that case, the system performance is always similar to that of optimal allocation. However, when the arrival rate is high, the selection process for"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the domain $\\Omega=(0,T)$ with $T<1$ and a uniform mesh over the domain $\\Omega=(0,\\sqrt{3})$ with $\\sqrt{3}<1$ in order to test the estimator on meshes that are not ideally suited for the investigated problem. The convergence of the a posteriori error to the desired rate $2\\frac{h_X^2}{W_1^2+\\omega_2^2}$ in Table~\\ref{fluid_residuals_uniform_smaller} indicates that the effect of the mesh on the estimator is negligible. However, the a posteriori error estimator does not yield valid results on the sequence $\\mathcal{T}_1 = \\mathcal{T}_0^{X,m} \\times \\mathcal{T}_0^{F,m}$ with $m=1,2,3$ in Table~\\ref{fluid_residuals_uniform_equal} and Table~\\ref{fluid_residuals_non_uniform_equal} because the time triangles erode and the time rectangles grow with each iteration of the pre"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends on the solid subproblem, since the fluid problem can in principle be solved accurately to any desired precision. Therefore, a balanced solution strategy for the fluid \\textit{and} solid subproblems is of utmost importance in the method. In contrast to existing balanced subproblem strategies, a novel procedure for setting stepsizes is proposed in this work, based on a perturbation theory for coupled systems. The overall convergence estimate for the overall error of the method entails contributions from the fluid stepsize perturbation, the solid stepsize perturbation, the solid boundary perturbation, the fluid-solid coupling and a coupling convergence factor that depends on overall convergence parameters for both subproblems. In this work, we construct the overall convergence parameters from the projections of the solution onto a basic solution space and solution derivatives along solution spaces. The obtained overall convergence estimate is quite robust, showing a typical accuracy of around $10^4$ for solutions to experimental accuracy of 1\\%. This is important in view of the potential higher order orders in frequency scale compared to existing strategies for the higher order methods. Furthermore, it is shown"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "16\\% down to 2\\% in Fig.~\\ref{fig:scan_floor2}, compared to $5.1$ GHz , where such improvements are not observed. The results are also compared in Fig.~\\ref{fig:targ_err_2} with a target visibility of 0.35, where it can be observed that the $2.4$ GHz backhaul has a target visibility of 0.55 and $2.5$ GHz frontage has a target visibility of 0.46. The table in~\\cite{hansen_throughput} shows that the $2.4$ GHz NB-IoT network has a larger capacity than the $2.5$ GHz RAN. Therefore, the relative error statistic is a good indicator to show the capacity loss one would experience if one uses the $2.4$ GHz configuration instead of the $2.5$ GHz configuration. Therefore, one can safely conclude that a proper scanning policy can significantly improve the error rate even for a network operator with a large coverage. Furthermore, in the appendix"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "gfa}, the nonlinear analysis of a unique magnetic structure observed across the CME, in conjunction with the coronal hole from which it emerged, showed that both the CME and the coronal hole were significantly thicker than previously assumed, and concluded that their combined mass load was currently insufficient for disruption. In other words, the event is currently sitting there with little consequence. However, there is no such subtlety in the \\cite{2016Natur} report, which describes the CME as `on a leisurely pace', `meandering its way', `meandering and slowing down...for at least a week', and `more sluggish than normal'. Clearly such a long meandering period, which never reached any significant acceleration, would not produce the powerful rejection energy required to disrupt the object. Instead, the concept of the `notable impact' must be employed. In other words, the unique characteristics of the CME and coronal hole interact to ensure that the CME slows down to a halt directly at the COVID, with the only difference in trajectory being caused by the COVID rejection effect. Since the COVID surrounds both the CME and the CME"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Magel2018}),  the matrix value function $W:\\mathbb R^p \\times \\mathbb R^p \\rightarrow \\mathbb R$ specified by this approach does not exist as a concrete function.  To give a concrete starting point, we assume that the correlation matrix $\\Sigma$ is given.  Note that we could also assume that $\\Sigma$ is a proportional error correlation (of any type), since the latter to the former is equivalent by a rotation of dimensions and a \\emph{block-diagonal} matrix multiplication with a Vandermonde matrix (see \\cite{Ruud2015}).  In either case, the corresponding VAIUO would be very expensive to compute directly.  To make things worse, in order to derive the value function specified by \\eqref{Wformula}, a measure need to be defined (see \\cite{Ruud2013}).  The measure used in \\eqref{Wformula} can also not be explicitly computed.  When either $\\Omega$ or $\\mathcal{X}$ is"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradients. In the original version of the Landweber iteration the gradient is easily efficient, while in the strengthened version \\cite{cuthbert15} introduced a simplified form of the gradient, which is effective also for suboptimal states. However, in the study of the iteration for initial values on the boundary of the boundary layer, the strengthened version has been used. In Figure \\ref{fig:sparsity_critical2} we reported the results obtained using the two versions of the gradient. The initial value for the study has been taken to be the same as the one in the critical point analysis (see Figure \\ref{fig:SESTD_critical}). The effectiveness of the efficient gradient is critical at the early iterations, since the simplified version of the gradient leads to a rapid decrease of the errors and then a sharp jump back up to system $S_1$. This jump implies the appearance of a heuristic signal, which indicates the existence of a local minimum. However, this local minimum is falsely, as shown in Figure \\ref{fig:sparsity_critical2}, it is connected"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual sequence and then conclude the collective convergence. The local convergence theory implies that the underlying global structure can be \\emph{always} ignored during the convergence procedure, and all other structural changes will happen locally within each sequence, which makes the design and application of the algorithm very convenient and unique. As a special case, the local convergence theory allows us to conclude that if a sequence $\\{x_{k}\\}_{k=1}^k$ does not converge to an optimal solution, then each $\\{x_{k}\\}_{k=1}^k$ will have a worse $\\ell^2$ convergence rate than one that is optimal for an optimal solution. An illustrative example is demonstrated in the accompanying video \\cite{fig5}. For comparison, the convergence theory in the linear case implies that the final solution will also converge for all sequences even if it converges locally. In addition, the convergence rate in the linear case is always the same as the $\\ell^2$ norm of the \\emph{initial} data.  The linear convergence theory indicates that if a sequence $\\{x_{k}\\}_{k=1}^k$"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the fluid density with $j\\in\\mathcal{J}$ (for $j=1,2,3$ for $N=2,3$); here $\\mathcal{J} \\subset \\set{1,2,3}$ is the set of dimensions. Then the incompressible Navier-Stokes equations are used to describe the flow of the fluid with pressures: $\\left(\\sqrt{g} \\nu u_j - \\Delta_g u_j \\right) + \\left(\\mu \\epsilon u_j - \\epsilon \\Delta_g u_j \\right) = 0$, $j\\in\\mathcal{J} $, where $\\nu, \\mu$ are the characteristic parameters of viscosity; $u_j$ represents the $j^{th}$ component of the velocity; $\\Delta_g$ is the Laplace operator with respect to the metric $g$, and $\\epsilon$ is the only parameter related to the dimension $N$ that determines the speed of viscous dissipation of the fluid; in our problem, $\\epsilon=\\sqrt{2}$. This physical"
        }
      ]
    },
    {
      "train_loss": 0.0661787109375,
      "update_counter": 3750,
      "epoch": 120.93548387096774,
      "val_loss": 5.87109375,
      "val_entropy": 0.726806640625,
      "val_ttr": 0.6197916666666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the change of the predicted timeline relative to the ideal timeline for both configurations. In the first case we plot the predicted timelines for both the maximum likelihood \\textit{only} and the Bayesian backtracking \\textit{and} likelihood fitting algorithms, while in the second case we show the performances of both algorithms without putting into account the putative correlations between the events, and the coordinated performances of an algorithm taking into account such correlations and not (both considering the maximum a posteriori probability interval as a benchmark). The figure shows how the accuracy of both algorithms can be improved by taking into account such correlations. The perimeter of the conformational landscape shown in Figure~\\ref{comparison_whole_timeline_configuration_2} is larger than that shown in Figure~\\ref{comparison_whole_timeline_configuration_1}, because the take into account of correlations between the events leads in some cases to an extension of the estimated conformational space. This effect is more pronounced in theBayesian backtracking algorithm, since it takes into account also the putative movement inside the free energy basin. The estimated confidence intervals shown in this"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $3$-by-$3$ rectangle. In the first two rows we consider the simple energy estimate $e(u) \\leq e(f)$ for the fluid solution $u$ computed by the GA$\\&$PM method. On grids $K_h$ sufficiently chosen with respect to the final time $T_{\\rm fin} = 1$ the error appears to be globally uniform, but it decays rapidly on sub- grids. For example, on the sub-grid $K_h^{(1)}$ the error is in the form $e_h = O(h^6)$ and it decays by another order of magnitude on the sub-grid $K_h^{(2)}$! This behavior is also reflected in the asymptotic relationship $e_h \\sim_h O(h^6)$ near the boundary between grids $K_h^{(1)}$ and $K_h^{(2)}$. We suspect that the error behavior is caused by a very thin corner near the top left corner of the domain $\\Omega$, where the solution is not well approximated by smooth triangles. We"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will again be defined as in the single-rate case: assuming that the solid discretization is uniform and the fluid fluxes change only at the interfaces between solid sublevels, the overall error in the relative formulation is of order $6\\frac{\\Delta t_{s}}{\\Delta t_{f}} + \\Delta X \\Delta Y \\Delta Z \\frac{1.5 \\Delta t_{f}}{D_{f}\\Delta t_{f}}$, where the time-step in the fluid sublevel is $\\Delta t_f$, the time-step in the solid sublevel is $\\Delta t_s$, the spatial resolution is $\\Delta X \\Delta Y \\Delta Z$, the number of degrees of freedom in the fluid is $D_f$, and the total number of degrees of freedom is $D_f + V_{s} = D_f + n_s$. As we will see in sections~\\ref{sec:numerical_tests_double} and~\\ref{sec:numerical_tests_triple}, the error estimate will often be overestimated. In"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about \\(2.2\\) cm (\\\\(R_c=90\\%\\)) in the first \\(100\\) ms of the scan. Above \\(100\\) ms, the frames attract increasing number of phantom entities and the errors increase again. \\(100\\) ms seems to be a critical time span in which the phantom entities are not yet accumulated and the linearization of the problem effectively remains unchanged. However, the errors quickly rise above \\(3\\) cm again as the phantom entities get accumulated. \\(T\\) (the time interval of the data acquisition)\\small/\\\\(T_{scan}\\) the frame rate of the scanning frames, is an important parameter that affects the accuracy of the results. Higher \\(T_{scan}\\) causes more phantom entities accumulating on the different floors and therefore, a shorter time between each frame is advantageous. However, higher \\(T_{scan}\\) keeps the floor less error constant and even reduces the layer error by about \\(10\\%\\) in the $2.4$ GHz RF range. This is due to the fact that the layer interference"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuation of rotation of the CME about the white line coincides with rotation of the internal structure of the CME, called a ``shell'' in this work. Consistent with this statement, the study showed that CMEs with similar internal structures maintained the same rotation period. Thus, there is a certain structure inside the CME deflected away from the white line and which affects the motion of the CME. This also means that the line of symmetry of the equilibrium cookie no longer holds true and the previously proposed classification of ``interceptor'' and ``steered by the west'' is no longer valid \\citep{1983sowi.conf..192A}. In fact, calculating the direction of deflection of the CME from the white line, one obtains a clear conclusion that CMEs are deflected by interaction with the coronal field of the Sun and not due to a cut in the coronal base AIA temperature of 171 (see, for example, the cuts in the work of \\cite{2011ApJ...743N.117"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015, MAGAR2018, Kipnis2018}) the study of the stability of the system for such statements remains for the most part an experimental problem. In order to overcome this limitation we introduce a new level set formulation for the conformational dynamics of DNA described by the Kipnis model. The model of MAGA\\/{R} (see Eq.~\\eqref{KipnisMAGAR}) is formulated for right-handed B-form DNA with a\u5de6\u624b-handed tangential cone, which is a temporary state of the molecule, implanted at a certain probability $p_i$ on the surface of each chromosome. For right-handed B-form DNA, the original Kipnis model (without the tangential cone condition) is formulated by introducing a stationary tangential cone which is embedded in the right-handed B-form DNA background \\cite{Banfitz2017}. In order to model polymerization into a thicker filament we modify the model parameters of Kipnis and introduce a threshold concentration $p_i^*$ for implantation of the"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates play a crucial role also in the Landweber iteration. The local approximation for the gradient of the kernel function $g_{\\lambda} (r) = \\frac{1}{4} (r^2 - 3)$ shows a peak at $r = 0$. This peak appears also in the gradient estimated by the gradient descent algorithm at $r_k = 0.5511$. Although the loss coefficient is independent from the learning rate, the special structure of the gradient estimates can make the iteration fails for high fixing coefficient values and small learning rates. In addition, when $\\lambda = 1$, the gradient estimates are and the boundary condition set by the heuristic suggests that the learning rate should be large. We have verified that these conditions produce the spurious first minimum. However, we confirm that this minimum is caused by the failure of the iteration and is not produced by a successful search~(see Figure \\ref{fig:Layer_hyper_search}). For this reason, the algorithm cannot produce a solution for this value of $\\lambda$. In practice, we"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the gradient, even though the overall objective function is still converging. It is easy to see that the gradient converges to a certain extent means that the function is well controlled by the gradient, and therefore, the location of the function vector is going to be correctly updated even if the overall value may not decrease. With this perspective in view, we argue that the gradient convergence in this manner is more than sufficient for the overall convergence. Also seen from this perspective, the local formulation of the problem in the regular case enables us to prove the gradient convergence while we cannot establish the global convergence in the nonlinear case where the domain of the problem is an entire space $\\mathbb{R}^d$. On the other, we observe that many recent studies on the numerical gradient methods \\cite{Enelcio2017, Enelcio2018, Enelcio2019} rely on the above gradient convergence theory to evaluate when the numerical gradient converges and can therefore be used as an implementation technique. This means that the proof of the gradient convergence in the regularization case completely suffices such an implementation. In addition"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient for $j \\in \\set{1,2}$. We call such a system $F: \\Omega \\times V \\to \\R^{D}$ a from-domain gradient interaction formulation. The interaction term $g_j \\cdot$ in $j \\in \\set{1,2}$ can be formally written as $g_j \\cdot v = \\sum_{i=1}^N a_{ji}(x) _i \\mu_i v_i$, and is often called a semi-discrete form. Here $\\mu_i = |\\mathcal{J}(x,v)|_{L^2}$ for $i \\in \\set{1,2}$ is the junction condition (JC) of $v_j$ on $x$, and $a_{ji}: \\Omega \\times V \\to \\R$ is the $L^2$-compatible function with $\\bigcup_{x \\in \\Omega} \\bigcup_{v \\in V} \\mathcal{J}(x,v) = \\{a_{ji}_j(x) \\}_{j"
        }
      ]
    },
    {
      "train_loss": 0.0359521484375,
      "update_counter": 4000,
      "epoch": 129.0,
      "val_loss": 6.359375,
      "val_entropy": 0.671875,
      "val_ttr": 0.61328125,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our theoretical predictions and simulation results for the early moments of the first largest explosion in the model A and model B. In the figure we emphasize the times when mass deposition rates in the primary source vary between $\\pm$ $10^{-4}. Then the main characteristics of the headwaves' growth are defined. The maximum speed of the headwave advancing along the stream tube increases by about a factor of 10 from the initial (very small) value, reaches its peak, and the details of the headwave advance gradually become clear. The advance of the leading edge of the headwave is well reproduced by the proposed model, which advances with the same speed and eventually forms a shock wave. The only difference that our theory does not describe is that the shock wave is always well bounded by the roof of the stream tube. Similarly, the characteristics of the second largest explosion in the model A and model B are shown in Figures~\\ref{comparison_whole_timeline_configuration_3} and \\ref{comparison_whole_timeline_configuration_4}. The main differences between our theoretical predictions and simulations"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2^{\\times}5$ cube. The grid size tends to $2^{2-5/d}$ as $d \\rightarrow \\infty$. We choose the prefactor $A = \\sqrt{2} (N^{-1/4} + N^{-1/8})$ and $\\lambda = 1$ for simplicity. For the a priori error estimate we have chosen the coefficient matrix $A$ as suggested by the discussion right after Theorem~\\ref{thm:fluid_estimate_inf_h}, whereas we fix $\\lambda = 1$ by a direct numerical search for the optimal $\\lambda$ on a single grid point realization. As $N \\rightarrow \\infty$, $A-\\Delta$ should become homogeneous of degree $-d$ around the center grid point; for this homogeneous function with the chosen value of $A$ $\\lambda$ will approximate $\\lambda = 5d$ exactly. Table~\\ref{fluid_residuals_uniform_equal} shows that our algorithm correctly finds this optimal $\\lambda$ on the homogeneous grid and produces approximate errors that accurately reflect the $"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will again be defined as in the single-rate case: the solid angle is integrated with nStepSteps discrete steps; the fluid angle with a combined $\\alpha$-step-of-size subiteration. The overall convergence is defined by the following three parameters: first, the initial guess for the solid structure and fluid mechanical parameters; second, the number of subiterations used for the fluid mechanical solves; and third, the total time-step for the overall iteration procedure. We define the expected convergence based on these parameters by defining an initial error at the end of the overall first iteration step as the rms length element of the solid structure. This initial error is further discounted by a factor $\\alpha$ (see Figure \\ref{fig:convergence). We define the total number of iterations expected for this error to be reduced to some $\\varepsilon$ (say $10^{-5}$) as the number of overall iterations. We performed a simulation with these parameters on a model with approximately 300k nodes and found an initial error of about $2\\times10^{-3}$"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": " $N_f = 10\\%$ \\cite{Ciasullo2017}. In Fig.~\\ref{fig:image5}, we show the movement of the simulation grid in the $2.4$ GHz signal as the frames are moving down to the bottom floor. The simulation is done with an  grid size of $100 \\times 100$ meters, to cover an area of $2 \\times 2$ km2. We choose a time window of $5$ minutes and conduct the simulation over a $240$ minute day. Here we observe that the fact that the values of acceptance from the scanner frames are constant in high-frequency signals, even the floors with a high change in the elevation. As can be seen in the figure below, the values of acceptance are all $0.5$, even for the last $2$ frames (highlighted in red), which contain the lowest elevation messages. We discuss this result, which can be seen as a statistical property of the Gaussian signal, in the \\cite{Tayfun_SIN"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuity of the magnetic field along the CME implies that either the CME does not develop a velocity within the solar system or that it has also a radial velocity. In the latter case, the body is not electrostatically charged, since the ion density profile during the propagation is constant. Moreover, assuming there is a radial velocity of the CME body, either there is no deflection of the global magnetic field by the ejecta or the field is altered. In either case the WSA is not a proper theory to describe this scenario. Indeed, in WSA, it is assumed that the magnetic field is unperturbed by the ejecta. Thus, if the CME has a velocity, this deflections either occur in the magnetic field of the Sun or in the solar wind particles, which are responsible for the electric charge of the ejecta. Either of the two cases is considered in more detail in \\cite{Shi:2013gfa,Shi:2014gmo}. In this work,  two mechanisms of the electric charge of CMEs resulting from the interaction of the"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015, MAG0810,MAG1311}), until now there are no general theoretical results for the expectation value of the return time to prove the condition is robust under small-time corrections. The exception is the expected return time to a single destination in a traveling salesman problem (TSP), where various expectations with respect to different cost functions have been analyzed in \\cite{Baraud2003,Humer2007}. We will extend the results in \\cite{Baraud2003} to derive the expectation of the return time to a destination for a path with small perturbations in the direction of . This is important for accurate simulations because it allows to evaluate the expectation of the return time to a destination to a good accuracy without having to evaluate the expected return time to the destination itself. The expected return time to a destination in a TSP is useful for various applications, including graph theory \\cite{Baraud2003,Schneider2010}, image processing \\cite{Burges1996,"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient, we test the efficiency of estimating the gradient using the GKB method for different settings of the initial boundary condition (IBODF) for the diffusion equation with white noise and periodic boundary condition. The IBODF for the diffusion equation of size $N=18$ is defined as  $A_{i,j}^N=\\frac{1}{N}\\mathbb{I}_{N^2}\\delta_{i,j+1}$ for $i,j\\in\\{1,\\cdots,N\\}$, where $\\mathbb{I}_{N^2}$ denotes the $N^2\\times N^2$ identity matrix and $\\delta_{i,j}$ is the delta function for $i,j\\in\\{1,\\cdots,N\\}$. We test the time cost and the efficiency of the gradient method for the AICG and LCG. For the AICG, we set the \\textit{MaxStep}$=\\{200,400,600,800,1000\\}$, the \\textit{MaxIt}$=\\{2"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution in a neighborhood of each initial condition even though we lose the assumption $g\\equiv 0$. There are many cases where we have this noise assumption, e.g. $f(\\mathbf{x},t)$ is a stochastic process, $(\\mathbf{x},t)\\in\\mathcal{P}\\coloneqq(\\mathbb{R}^{d}\\times\\mathbb{R})\\setminus K$, where $K$ is a reference system such that $\\mathbb{R}^{d}\\times 0\\subset K\\subset\\mathbb{R}^{d}\\times\\mathbb{R}$. In such a case, the noise is always directed to the outside of the reference system $K$, i.e. $\\mathbb{P}\\coloneqq\\mathcal{P}\\subset\\mathbb{R}^{d}\\times 0$, and we can derive optimal controllers to make the state take steps in the opposite direction and to reduce the space of attraction $\\mathcal{P}$ by using the theoretical foundation of the linear-quadratic-Gaussian (LQG) theory \\cite{LQG}. The state then will"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $a$ facing $\\Omega$ for $j\\in\\set{c,m}$. We solve the problem of modeling the surface $m$ facing the boundary of the domain $\\Omega$ by assuming that condition $u = \\pm \\max_{_{x \\in \\partial \\Omega} u(x)} $ is satisfied at the corners of the steps. When none of these conditions are satisfied, we only solve the problem for one side of the boundary and assume that the function $m = u^{-2}$ is continuous in a neighborhood of the boundary. \\cite{Schaffer2003, Bertini2006, Sikirov2001} it is not possible to assume that $m$ is continuous even at the corners of the steps. Next, an estimate of the error which occurs in transforming the data from the domain-based problem to the boundary-based problem is given and this estimate is shown to be valid also in the problem of modeling the surface $m$ facing the boundary of a smooth version of $\\Omega$. The smoothness of $\\Omega$"
        }
      ]
    },
    {
      "train_loss": 0.0337529296875,
      "update_counter": 4250,
      "epoch": 137.06451612903226,
      "val_loss": 6.541015625,
      "val_entropy": 0.66015625,
      "val_ttr": 0.6188151041666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our theoretical predictions and the results of the two configuration discussed in Section~\\ref{discussions} for MTS frameworks. We obtain the plots by setting $k_z = 10$ in Eq.~\\eqref{eq:estimate_sdhz_config_1} and $k_z = 20$ in Eq.~\\eqref{eq:estimate_sdhz_config_2}. Also, the mean transmission $\\mu_k$ is computed for the two configurations and used to color the lines of the graphs corresponding to MTS with a binary potential. Notice that the output noise variance $\\sigma^2$ for both configurations is chosen so that the mean values of the noisy SECTs are similar. However, this does not guarantee that the results are equivalent\\footnote{In fact, as the choice of $k_z$ affects Eq.~\\eqref{eq:estimate_sdhz_config_1} only near to the transition frequencies, our theoretical predictions should be valid even for large choices of $k_z$. In the two shown timelines the lines corresponding to theoretical predictions are"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2^{i}\\times2^{j}\\times2^{k}$ cube, where $i,j,k \\in \\set{0, \\frac{1}{2}, \\frac{1}{3}, \\frac{1}{4}$}&$\\frac{1}{5}$, $\\frac{1}{6}$&$\\frac{1}{7}$, $\\frac{1}{8}$&$\\frac{1}{9}$, $\\frac{1}{10}$&$\\frac{1}{11}$, $\\frac{1}{12}$. The grid size is $H=1$ so that the time steps are all equal to $\\frac{1}{120$. We set $\\alpha=\\beta=0.1$ and use the prefactor $k_{max}=10$ in Eq.~\\eqref{fluid:ap_estimator}, where $k_h$ is defined in Eq.~\\eqref{fluid:HKH}. We compute the actual error $e_{n}^{(1)}$ of the $n$-th iteration of the finite sum in Eq.~\\"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In this chapter we discuss three different solid sub-solvers: (i) a pure Lagrangian solver (LLASL), (ii) a cached numerical force-displacement solver (CND), respectively (iii) a solid coupling in fluid solution steps (SOCFIL), which are adapted to different time-steps in each problem. The pure Lagrangian solver is slow but very precise. The cached numerical force-displacement solver has a much higher time-to-error ratio but requires many discrete solid steps, which is problematic for high resolution problems in which the solid motion has to be accurately solved. The cached numerical force-displacement solver is advantageous for problems with large reference sizes where the hydraulic time-stepping has to be reduced. The highest advantage of SOCFIL is that it can be used for very high resolution problems with a large number of solid nodes as well as small reference sizes: the solid coupling in hydraulic solution steps does not add significant overhead to the solid subroutine. \\cite{Yang2020} demonstrated SOCFIL with tangible advantages for problems with different reference sizes. Figure"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $1.0\\%$ in the high frequency range, where the frames are adequately resolved. Viewed from this point of view, scanning is a highly effective strategy to use the RF hardware to reduce the number of errors in a complex building. However, over the course of a sampling period, the same frames are reused, thus, affecting the information content of the resulting estimation function, as shown in Fig.~\\ref{fig:error_graph}. Thus, in addition to the framing strategy, the power series transformation (or wavelet transformation in \\cite{bremaeck2010novel}) is successfully applied to generate a power series image, which is small and stable over the whole sampling period. This is because, by applying this transformation, the spatial resolution is very high at the beginning and the time resolution is high towards the end of the sampling period without mixing frames. In this way, the power series image has a very flat curve which is stable throughout the period. This allows us to use a power series image to help in the calibration of the GPS-RAE system on top"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuity of the zero-radii surfaces of the identical surface revling and identical line elements constrains the motion of streams originating at the source and renders the system in equilibrium. However, this condition is not satisfied for the CME and AR analyzed by us. Two CMEs (Fig.~\\ref{fig:fig00001}) have left the Sun approximately simultaneously and flew almost simultaneously to Earth, arriving at {\\emph{PSP}} approximately simultaneously as well (Fig.~\\ref{fig:fig00003}). One of the CMEs has rolled over into a second, which preceded it near the Sun (Fig.~\\ref{fig:fig00002}). Clearly, the two CMEs cannot both have originated from the same region of the solar surface. One of the CMEs likely grew subsequent to the other at the F-region level and beyond. It is obvious that one of the CMEs will no longer maintain connection to the source region and will become a random walk of cosmic traffic, bounded by the selection region for the solar wind (see"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015CMV, Kipthong2017JCMV}), so far most of the studies have focused on the case of \\emph{deterministic} trajectory predictions (popularly known as thinned jet streams or deterministic predictions on the whole interval), which are simple and easy to understand. The recent work of Kipthong et al. (PhD thesis \\cite{Kipthong2017}) showed the applicability of the tangential cone method to stochastic predictions. The application of the probabilistic tangential cone method in determining the probable trajectories can be considered as an extension of the method of fragments \\cite{Strasenberger1995} to the case of stochastic data. Similar to the case of fragments method, information available in the cumulative flight time duration from the probabilistic forecast can also be used to determine the probable trajectories. Note that the predictive interval containing 80\\% of the cumulative flight time predictions has precisely the structure as the tangential cone. This was noticed by Kaltenbacher et al. (University of Alberta, PhD thesis \\cite{Kal"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, which are always allowed to have inaccuracies. In the case of the fictitious first minimum, we notice that because $f$ is not bounded from below, even if the IPWS is not applied to $x^{(0)}$, there may still exist an outcome for which $x_{k+1} \\geq x_*$. If we assume that efficient estimates for $f^\\prime(x^*)$ and $f(x^*)$ are known, then the fictitious minimum is indeed the minimum and the Landweber iteration is able to draw it to $\\N$ (see \\cite{Evans:2006vh}). However, in reality, accurate estimates for $f^\\prime(x^*)$ and $f(x^*)$ need to be obtained by a small number of numerical evaluations of $f$ and $f^\\prime$ and therefore it is allowed to have inaccuracies. In this case, the iterative iteration may finally draw $x_{k+1}$ to $x_* > x_{k++1}$, which"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution in a neighborhood of each initial condition even if the solution may be non-continuous. This phenomenon observes regularly by authors of numerical scientists around the world. To alleviate this problem, we prove that the optimal solutions with respect to $L_q$ integrated norm actually converge in all areas where the $L_2$ solution is continuous. The $L_2$ solution, as it is known, is not appropriate for algorithmists to implement their algorithms since it is sensitive to high-frequency perturbations. We prove this convergence effect in the paper \\cite{cai2018solution}. This technique not only can avoid the algorithmists effort to derive the exact solution but also makes the algorithmists to implement our algorithm exactly same as the linear case. For the completeness of proof, we first pass to prove the convergence of $L_2$ solution. Then, we also pass to prove the convergence of optimal solution with $L_q$ integrated norm. The convergence indices are also important to discuss in the paper \\cite{cai2018solution}. Finally, we compare the"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $u = g_j$ on $\\partial \\Omega$, $j = 1,2$. We say that $u$ is a result of solving an ``average-based'' problem ($Q_ {et}$) if the parameter in the average function is approximately $\\frac{1}{N(\\Omega)}$ (otherwise we say that $Q_ {et}$ is done strictly), and $N(\\Omega)$ is the area of the domain $\\Omega$. By $Q_ {et}(K)$, we will denote the average penalty, that is the term $\\frac{1}{N(\\Omega)} ||u - u_{et}||_E$. If $E = \\ell_2$, we keep the word $et$ unchanged, while for $E = \\ell_1$ the average penalty is of the form $Q_ {et,_1}(K) = \\frac{1}{\\lambda_1^{et}(K)} = \\frac{1}{\\inf \\left|\\lambda_j^{et}(K)\\right|}$. We remind that $\\lambda_j^{et}("
        }
      ]
    },
    {
      "train_loss": 0.033861328125,
      "update_counter": 4500,
      "epoch": 145.1290322580645,
      "val_loss": 6.67578125,
      "val_entropy": 0.6527099609375,
      "val_ttr": 0.6236979166666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the agreement (green lines) and disagreement (red lines) of the reference velocity field in the case of configuration \\emph{X} and \\emph{Y} (see Table~\\ref{config_name_objectives} for the references), respectively. The agreement is overall good but not perfect. The disagreement increases in space away from the mean positions of the agents ($y=1.5$ in Figure~\\ref{comparison_whole_timeline_configuration_2}). However, in this spatial location the core shifts towards $B$ in the first half of the simulation (see Figure~\\ref{comparison_whole_timeline_configuration_1}(a) in the case of configuration \\emph{X} and Figure~\\ref{comparison_whole_timeline_configuration_2}(a) in the case of configuration \\emph{Y}). Also, the agreement is fast in getting worse in the tail of the time axis. However, in Figure~\\ref{comparison_whole_timeline_configuration_1}(d) it can be seen that the disagreement in the core does not produce a significant difference in"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2^{i}\\times2^{j}\\times2^{k}$ square. The grid size seems to grow linearly with the size of the domain. This can be explained by the observations after solving the problem on mesh $M^{(3)}$ in Table~\\ref{fluid_residuals_uniform_equal}. We also observe that the a posteriori error seems to be lower than the approximation error of the finite element method in the case of uniform meshes. Thus, the error estimation proposed in this work seems to work well even in this simple case when the discretization does not change. In Table~\\ref{fluid_residuals_non_uniform_less} we present results for the nonlinear estimator on a sequence of time meshes that are non-decreasing in area and are generated by refinement over the first three dimensions (i.e., we have $2^{i-1}<2^{i}\\leq 2^{i-1}+2^{j}$). In this case, the first two dimensions are fixed, and we refine the third dimension by factor of $2$. We choose the time"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the fluid subproblem the convergence through the solution of preconditioned linear equations shows the importance of the adaptation. We choose the preconditioner based on a GMRES procedure with a \\textit{soft-thresholding} preconditioner. The GMRES procedure is stopped once the solution of each linear subproblem is unchanged for two iterations in a row. However, since the solution of the solid subproblem is important for the fluid solution, the step size for the fluid subproblem has to adapt to the solution evolution of the solid state. In this context, a total step size is calculated by multiplying one time step with several factors, depending on specific conditions at each solution level. While the adaptation in time is obvious for the fluid problem, the discretization of the shape- and size-parameters in space is a more sophisticated procedure. We observe that the discretization of the shape- and size-parameters results in a uniform mesh over the solution space. This yields a fair competition of the fluid and solid subproblems, which would be too fast otherwise, and a clear improvement in solution quality. In this context, we select"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $1.0\\%$ for the scanning frames. Fig.~\\ref{fig:image17} shows the coverages made by the $4$ power level and the $10\\%$ duty cycle schemes over $80$ meters$^2$ down to a sampling density of $10^{-4}$ m. Thus, the calculation is not accurate enough to be considered as representative. Moreover, the $4$ power schemes for the $2.4$ GHz band show that, even though the coverage varies for different solutions, the average coverage makes about $9.3\\%$ worth of difference between the $4$ power and $10\\%$ duty cycle schemes. However, the $4$ power schemes for the $900$ MHz band show that the average coverage difference for different solutions is about $70\\%$ of the power difference. Therefore, we may conclude that the $4$ power schemes are not suitable to evaluate the accuracy effects of power schemes. We can see the effects of duty cycle for the $2.4$ GHz band in Fig.~\\ref{"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuity of the zero-radical structure of the Angular Velocity implies that all CMEs move in some direction (not necessarily towards Earth). In other words, a population of CMEs can never completely vanish; some always remain in the solar wind. Furthermore, a complete dismissal of a near-radial trajectory was shown by \\cite{Yang:2016gfa} among observations of near-radial CMEs with respect to their orientation as it responded to evolutional evolution and fusion with an ambient coronal mass element. Similarly, \\cite{Seivertson:2016pba} discussed a CME cross field magnetic field reversal, which if true would imply a much more favorable association with a radial trajectory. Finally, as discussed in \\cite{Hege:2016ila}, most radial CMEs originate from the streamer belt and are therefore characterized by halo alpha particles, which do not survive in the solar wind when traveled out to Earth. The non-radial definition, however, remains highly controversial since it requires modeling of the solar wind inside of "
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015, Millet2020}),  the matrix $K$ can be viewed as a reference frame accompanying the {\\it locally polarized} ACR $\\Sigma_{\\nu}$ according to \\eqref{acrf}. In this perspective, the constraint \\eqref{tangentialconecondition} simply requires that all material units of $\\Sigma_{\\nu}$ remains fixed with respect to $K$ when the particle rotates about the twist axis $\\alpha$. Thus if the spineroid is used to apply a torque to the substrate, the implanted atoms must be aligned in order to {\\it not} break the rotation frame of $K$. This perspective also explains why the matrix $K$ can provide a sufficient constraint to allow one to use classical mechanics to model a spineroid's rotation about the twist axis $\\alpha$. As mentioned above, steel grade steels are used for laboratory experiments \\cite{Sirbu2017}. Although steel is not a magnetic material, it can be proven that the metals used in this category of spineroids have a non-zero value of $k"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x)$ and $f(x)$ conditions. For the Landweber iteration neither the desired local minimum or the modified $L_2$ error limit is known to be efficient for obtaining the $x_{k+1}$ each time from table entry. However, several strategies have been proposed for efficiently implementing the Landweber iteration~\\cite{Sw13,Un18}. In the original work, the authors of \\cite{Ma01} propose to estimate local errors by taking into account a temporal error account for $k+1$ (see \\cite{Ma01} Section 5.4), and also suggesting projecting the iteration onto a direction in which the algebraic error is of order $L_2=1$. However, this strategy produces several problems, as a varying error condition across a small range of the parameter $k$, and also the possibility of obtaining false local minimums around the entry of the table (the issue of imaginary errors with $k_i\\approx c$ was also discussed in \\cite{Ma01} Section "
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution at the point where the vector field is defined, which is only present in the linear case. The complexity of the vector field is independent of the dimension choice. The empirical results are sensitive to initial noise and present the advantage of the proposed mechanism. The proposed mechanism does not require pre-smoothing steps to remove initial noise and it has fewer iterations to reach convergence. The number of iterations is independent of the time-step choice. Compared to the full parameter method \\cite{Deng2018}, which requires additional noisy initial solution and large number of iterations and errors, the proposed subproblem has fewer parameters and we do not need any additional initial solution. Also, the empirical results indicate that the convergence of the subproblem solution is faster and labels for non-projectable components are more accurate than the full parameter method. Although we do not require the initial solution, it is better to avoid introducing too much noise since this can naturally be resolved by the subproblem. However, such a large initial noise may lead to a badly initialized deep learning model, and hence does not benefit from the additional iterations"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $u=\\text{const}$ relative to $\\partial_j \\subset \\partial$, $j=x,y$, is called fundamental solution of the surface tension. The surface tension models the adhesion of highly heterogeneous materials, e.g. dry grains on a thin layer $\\Omega$ \\cite{Castillo2014}. Indeed, in contrast to a regular case, where one has total boundary condition $u=0$ on $\\partial$, the boundary $g_j$ denotes the boundary condition $u=\\text{const}$ on $\\partial_j \\subset \\partial$. Note that due benevolence of coefficients, the finite layer $\\Omega$ contains a large number of grains from a large number of different initial particles. Therefore we want to model a cumulative adhesion on the surface of each grain by setting $b<0$ and identify a threshold $m$ such that grains with $d_j>m$ are considered to be dirty. Even though the surface tension is usually considered as a nonlinear function of the adhesion degree $d_j=r d_k l_{k,"
        }
      ]
    },
    {
      "train_loss": 0.052875,
      "update_counter": 4750,
      "epoch": 153.19354838709677,
      "val_loss": 5.37890625,
      "val_entropy": 0.7666015625,
      "val_ttr": 0.6227213541666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " comparison between our algorithm and the offline versions in selecting contents to rate the user items at different confidence thresholds for the same set of 500 thousand items in Configuration 1 and 2, respectively. Also, we present the results from the algorithm with the low, medium and high values of $\\omega$ in the same configuration in Figure~\\ref{comparison_whole_timeline_configuration_1}.  In the case of Configuration 1, the values of $\\omega$ only affects in recommendations quality at high confidence threshold choices. It is noticeable that the values of $\\omega$ affects in recommendations quality only in the high confidence threshold when the algorithm does not know the contents to be rated. In fact, once the algorithm builds the explicit relation matrix from the partial ratings of the items, the values of $\\omega$ only affects in recommendations quality in a negative way. This means that the values of $\\omega$ compresses the range of the items items items that are not liked by some items and are liked by other items. In other words, the values of $\\omega$ leads to less diversity in the preferences of"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " $K_0 < \\cdots < K_{n_{Fin}}$ in the case when the parameter domain and the meshes are all uniform and equal. We first examine the case in which the number of outer iterations $k$ is fixed. For this value of $k$, the estimates are almost exact until the final mesh $K_n$ is reached, after which they are slightly biased. Increasing the number of outer iterations $k$ to $k = 2$ multi-level iterations per interval reduces the bias but does not completely correct the estimator. One reason for the residual error remaining high is that the parameter continuation problem is solved with a fixed number of 200 degrees of freedom. In the last few iterations, we observe the opposite behavior, with the estimators being highly biased. This is due to large gradients in the parameter space around the local minimum, which lead to large flux residuals and hence small approximation residuals. In this case, we increase the number of continuation iterations to 500 and this reduces the bias and slightly the residual error. Note that a smoothing of the function $\\psi(\\"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends on the solid subproblem and the fluid subproblem, which require different strategies. Since motion is continuous in mechanical terms, the fluid subproblem has to be planned carefully for large time steps. In order to achieve large time steps while maintaining accuracy, a strong dependence on the pre-computed empirical fluid data has been utilized. Data are obtained from a precomputation step in which a large number of points in the whole space are used. This large memory requirement is the main drawback of this approach and it prohibits using large space of space-time volumes. In order to overcome this limitation, a new algorithm for subproblem initialization is suggested and implemented in a memory-efficient way using a grid over the mean value of the data. The idea reduces data memory requirement by approximately 500 times compared with the original form in order to support large simulation spaces with large processors teams. The new algorithm is fast and produces high accuracy results with time consumption around one fluid step. This subproblem is designed for large space of space-time volumes and therefore liquid flowing in solid boundaries and embedded solid parts"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": " $1.5$ GHz are reduced by as much as $15\\%$. In addition, the presence of scanning frames in the RF records seems to have no negative effect on the accuracy of the floor estimates taskned to the RF system. In fact, the data shows a reduction of around $3.5\\%$ in the average floor estimate compared to the $2.4$ GHz system. This reduction in average floor is further reduced to $1.75\\%$ by incorporating the information from the low and high RF bands. It is worth mentioning that the high band ( $1.6$ GHz) provides less information about the receiver's location, thus has a direct correlation to the floor-wise estimates. However, by incorporating the high band into the floor estimation algorithm, the average floor selected by the mobile user decreases by $1.75\\%$. As for the data acquisition cost, the following protocol was implemented. Each mobile user is allowed to fetch $15$ MB worth of data, which translates to $681$ RF records ($5x15$ min with $1"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that from the point of view of the global aerodynamic model (ABC model), the solar wind stream is accompanied by an active transport structure called a wind storm. The aerodynamic properties of wind storms are modeled as high turbulence intensity in a calm background solar wind. The global aerodynamic model allows to successfully Regio- correlated with the enhancements of the magnetic field structure along with the stream currents with super-Parker spiral characteristics. These enhancements are well-matched by interactive magnetic field models (CME models). Moreover, it was shown \\citep{Valgushev:2018jg,Valgushev:2019nja} that the global aerodynamic model correctly predicts storm-associated suprathermal solar wind heating, which is subsequently transported outwards and becomes visible as late-time enhancements of coronal temperature. In other words, late-time observations of the solar wind provide 3D spatial information about the internal structure of a CME, which allows one to infer its internal properties. Moreover, it was shown that dynamic pressure plays a very important role in the development"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher1997,Levin1997,Hua2012,Hua2013,Hua2016,Schlaffer2000,Schlaffer2002,Schlaffer2005,Henningson2012,eddolati2015parameterization,Yvovelin2016,Yvovelin2017,Erne2020} and in particular for non-linear regression \\cite{Polak1968,Wood1992, Levin2010,Zimek2016,Hua2016}), the functional data community has put particular focus on the isotropic cone condition \\eqref{isotropicconecondition} for two reasons. First, the isotropic cone holds for many real-world data sets. Second, when the isotropic cone condition is violated (which is by definition for non-axial data), it is typically possible for the data to be better described by a mixture of"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the $f(x)$ and $f'(x)$ values at $x_k$. In OLRP and LP methods the values of such estimates are well defined and accurate. In the Landweber method the iterative procedure is stopped once the norm of the current step solution $x^{(k+1)}$ drops to some predefined threshold. This means that the values of $x^{(k+1)}$ and $Ax^{(k+1)}$ are used as estimates for the values of $x_k$ with respect to the function $g$. In the context of the present discussion on the non-convergence issue, it is important to observe that $x^{(k+1)}$ is obtained by processing the current step solution $x_k$ through a grid matrix $B$ with large size. Thus, in general $x^{(k+1)}$ lies far from the actual solution $x^*$, hence the chances for $B(x^{(k+1)})-x_k$ to not contain $x_{*}$. If the above occurrence happens, then the effective solution $x_{*}$ is poorly"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the solution to be locally in $C([0,T])$ even the initial solution is in $C^0(\\mathbb{R}^{d})$ or $C^0(\\mathbb{R}^{d},U)$. See Theorem 2.3 of \\cite{Dai-Kang-Turan-Agtasov-2018}. In particular, the function sequence $\\{u_{k,\\lambda_{k}}\\}_{k=1,2,\\dots}$ is shown to converge to some solution $u_{\\lambda}$ of theNLSFC~$\\eqref{NLS-FC}$ with $\\lambda=\\lambda_{\\infty}<\\infty$. The convergence does not be globally consistent. However, in practice, we only need to solve the problem of $u_{k,\\lambda_{k}}$ in the time interval $[0,T_{k}]$, where $T_{k}$ is a monotonically increasing sequence of positive integers $\\{T_{k}\\}_{k=1,2,\\dots}$, and the number $\\{T_{k}\\}_{k=1,2,\\dots}$ tends to $\\infty$ as $k\\"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the solution in different directions. From the existence of strong solutions in $\\mathcal{D}^X_T$, to their weak convergence in $\\mathcal{D}^X_T$, to the characterization of hidden regularity results and to the stability under Sobolev embeddings, we develop a new strategy to get gradient flows in $\\Omega$. We define a $L_2(A)\\rightarrow L_2(B)$ isomorphism $E_{A,B} : L_2(A) \\rightarrow B^{\\frac{N}{2}+1}_{L_2(A)}$ with the Grad-Calderon estimate. We use it to model the gradient in a space with unknown basis. We define the gradient in $L_2(A)$ as the operator $E_{A,B}(u) : = \\int_A u \\, dA$ defined in the\u6977\u4f53, which is the gradient in $L_2(A)$ defined w.r.t. the basis used to define it. We then define a constraint on the basis of the elliptic operator $A$. Finally,"
        }
      ]
    },
    {
      "train_loss": 0.043798828125,
      "update_counter": 5000,
      "epoch": 161.25806451612902,
      "val_loss": 6.30859375,
      "val_entropy": 0.677734375,
      "val_ttr": 0.6136067708333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our multiple choice event sequence recognition method and the three step method in detecting event sequences for the same three scenes under different configurations. We can observe that our method performs better than the two-step method in detecting event sequences for all configurations in both Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2}. In addition, it is clear that our method performs best when it uses fewer layers in detecting event sequences. For example, it has seven hundred thousand parameters and perform well under the \\textit{whole_body} configuration. In contrast, the two-step method performs according to the number of layers it uses for detecting event sequences. For example, it performs its best with more than one thousand two hundred and fifty thousand parameters under the \\textit{whole_body} configuration. Moreover, our method does not require such different biologically implausible amounts of training parameters for detecting event sequences when it uses as little as three layers. In summary, we conclude that our method performs better than the two-step method for both performance"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2\\times 2$ square. In case of $k=1$ we apply standard error estimation through the interpolation. For $L^2$ norm we use the interpolation with respect to the finite dimensional space $V_h=\\{u_1,u_2\\}$ while for $H^1$ norm we approximate the variance of the approximation's gradient by projecting it onto the finite dimensional space $V_h=\\{u_1,u_2\\}$. In the last two lines of the table we present the maximal and average error for $L^2$ and $H^1$ norms over the mesh $K_i$ for the third iteration. We can see high accuracy of the a posteriori estimator even in the simple domain. The gradients of the error estimated through the error interpolation and the estimation estimated through the variance of the approximation's gradient have very close values and in most of the cases they are equal. The average error over the mesh is slightly lower than the global error. The error in the optimal parameters estimation is very low even in the simplest domain."
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In this manner, the\u671f\u671b future flux provided by the fluid is computed at each time step of the sub-domain, which is used as a ``run-time parameter'' by the solid sub-domain during the corresponding time step. For the fluid solver, we have chosen the PISO algorithm available from the COMSOL Multiphysics \\cite{PISO} package, which is fast and accurate for boundary-driven flows. Similarly to the PECV \\cite{PECV2} package, COMSOL Multiphysics offers only simplified solutions for the solid behavior. However, by adapting strongly from the traditional ``rigid'' approach \\cite{RigidApproach} to the more efficient virtual work (RW) approach introduced in the first part of this chapter, the solid behavior can be modeled fairly accurately. For the fluid flow, we follow the case with constant inflow and out flow, presented in Figure \\ref{Fig7}. We have fixed the boundary conditions on the wall (condition 1) and on the output boundary (condition 2); all else is zero. For the solid behavior, we"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% in $2.4$ GHz on a single floor if $N_{PCK}$ is small. The relative error for a scanning frame on two floors is recorded as $1.4\\%$. The next stage after using the above refinement steps is to correct the center frequency of the RF channels. We use a frequency-frequency correlation across frames, where a strong correlation is identified over the first 20 frames in a receiver chain. Using the corresponding top 10 frequencies in that receiver chain, we find that the correlation between a set of receivers, which change by 1 or 2, has a stronger correlation than the ones changing more than 2. So, we group the receivers into clusters of 1 or 2, and create new RF channels for those clusters. See Appendix \\ref{app:freqCorrel} for details. The number of these clusters per receiver chain depends on the contents of the cache. Finally, we create a map between the old and new RF channels, and this map is learned using Aligment 3. This map is learned for each cache"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that with increasing heliocentric distance of a CME, its the acceleration decreases and its velocity becomes closer to the solar wind velocity. This means that the traditional equilibrium version of the CME requires that the velocity difference between the S/C and the CME slows down and becomes almost zero. This situation is unrealistic, especially for very long CMEs with their activation in the solar wind magneto-rotation zone \\citep{2016ApJ...828...51S,2019ApJ...879...47D}. Moreover, the interaction between a small CME and a soft wind skirted by an astronaut spacecraft, which represents a small part of the total amount of CMEs out there, is expected to push the S/C into the streamer belt and into a high-density, high-speed environment. Thus, the traditional hypothesis stating that the active region will become a deflationary structure and that the flux rope will pop and explode seems to be reasonable. Furthermore, the data presented in \\cite{2016ApJ...828..."
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,Hock2020}),  the implicit assumption that the functional $f$ is defined in (\\ref{generalelasticproblem}) is that of a ``standard'' function, i.e. one that is defined in a Banach space (in this case, since $\\mathfrak{X}_{\\mathrm{std}}$ is the underlying space, it depends also on a Banach algebra assumption on $f$). In order to relieve this implicit assumption, we have recently (see \\cite{Hock2019}) defined a general \\emph{elliptic problem} for orthotropic structures, which holds for \\emph{any continuous scalar product} $\\langle \\cdot |\\cdot \\rangle$ (including the standard one) operating on a Banach space $\\mathcal{X}$. The corresponding abstract boundary value problem is given by $f \\in \\mathcal{G}$ denoting the set of ``solutions of the boundary value problem to the scalar elliptic equation $f\\left(\\langle \\textbf{S} |\\textbf{S} \\rangle \\right)=g$, where $g$ denotes the"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, which are obtained by a fast starting iteration process based on the BICP method. The first landweber iteration may reach this local minimum and then $d_1=1$, but for the termination of this local minimum we need to perform  $d_2$ iterations and therefore the total steps is $1+2m_1+2m_2$, where $m_i$ is the number of landweber iteration for each step. Since either $m_1$ or $m_2$ should be defined as $2m_1+2m_2=200$, we define it as the number of steps which is equivalent to  $d_0=100$, which means that we use auxiliary estimates for $f^\\prime(x^*)$ in the first landweber iteration. There are several variations of the method such as  we can also define  $d_0$ as a combination of these two ideas, e.g., $d_0=1"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the sequence $k-th$ column vectors $X_{k,1}^\\top,\\ldots,X_{k,S_-1}^\\top$ (or equivalently, of the intermediate solutions at iteration $k$ compared with $k-1$ stage) to exact destinations in a neighborhood of zero. For $X$ far from the zero vector, it may be exactly matched the solution $X$ given $K$ and $y$, even some $k$ beyond $K$, but we prove the convergence only in a neighborhood of zero (of size less than $S$ including the first $K$ iterations). We prove that there is global convergence, but the convergence rate is not universal as it may depend on the implicit function theorem-like function at non-zero entries of $X$. In other words, we prove that $X$ converges to a solution of $Y = X^{\\top} R_2$ where $R_2$ is a possibly non-diagonal positive semi-definite matrix of size only $S$ times the iteration number $k$ when the implicit function theorem-"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $p$ which is now assumed non-smooth. We remark that although the solution is defined as a function on the Laplace transform of $v$, it takes values in $C^0(\\Omega, \\mathcal{H})$ (see Definition~\\ref{def:seal}) where $\\mathcal{H}$ is a Hilbert space of functions measuring the energy of the vector $v$. The construction of the Laplace transform depends on the perturbation of the simple potential $\\tilde{g}$, that is, $g_j = p \\cdot \\tilde{g}_j$, with $\\tilde{g}_j \\in \\mathcal{V}$ ranging over some Banach space of smooth potentials. Such a behavior is actually obtained when $p$ is a rough surface, and one approximates it by smooth sub- surfaces which are multiplied by $\\tilde{g}_j$. Of course, in reality $\\mathcal{V}$ will be $\\mathcal{H}$ in our example, but in more general settings one can get formal proofs of consistency of the Laplace transform if the Banach spaces $\\mathcal{V}$ and $\\mathcal{"
        }
      ]
    },
    {
      "train_loss": 0.03335546875,
      "update_counter": 5250,
      "epoch": 169.32258064516128,
      "val_loss": 6.599609375,
      "val_entropy": 0.64990234375,
      "val_ttr": 0.5550130208333333,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 10) (22, 10) (23, 10) (24, 10) (25, 10) (26, 10) (27, 10) (28, 10) (29, 10) (30, 10) (31, 10) (32, 10) (33, 10"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our theoretical predictions and the simulation results for the same two configurations as in Figure~\\ref{comparison_whole_timeline_configuration_0}, i.e.,~(a) the configuration with $K=2$ hidden layers and~(b) the configuration with $K=3$ hidden layers. In both cases, the curve associated to theoretical predictions are those obtained using parameter values $\\{b_1^*, b_2^*, d_1^*, d_2^*\\}$ for the three hidden layers. For instance, in Figure~\\ref{comparison_whole_timeline_configuration_1}, the theoretical predictions for the configuration with $K=2$ hidden layers are obtained using $\\{b_1^*, b_2^*, d_1^*, d_2^*\\}$ whose $b_1$ and $b_2$ are given in Eq.~\\eqref{eq:bk}, whereas for $\\{d_1^*, d_2^*\\}$ we used the solution given in Eq.~\\eqref{eq: dits}. Thus, from what we can see in both figures, our theoretical predictions are very accurate as"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2\\times 2$ square. In Case A we use $k$-th order MINEF2 on both sides while in Case B we apply $k$-th order TNEFA. We set the initial data $\\sigma = \\sigma_0$ to make the flows converge to some initial error. The error $\\error_{\\rm apexr}$ shows the error estimated by the postprocessing method, while the error $\\error_{\\rm apexr,X}$ shows the estimation of the error $X$ considered as true, where $X$ one of $\\{,\\ textual, numerical, Einstein\\}$. The error $\\error_{\\rm apexr,X}$ shows the error estimated before applying the postprocessing method. The Einstein error is computed according to~\\eqref{Einstein_error_square}. Case RA1-TL denotes MINEF2 on the right-handed side and TNEFA on the left-handed side computed by the method of cases in Section~\\ref{sec:a_posteriori_evaluation}. Cases RA2-TL and RA2-MINEF2 on the right-handed"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will ultimately be determined by the elements of the global solution with the least accuracy, i.e., the fluid solution in this case. However, sufficiently accurate solutions for arbitrary complex boundaries and flows can be achieved by increasing the grid sizes of the fluid subproblem. This increase of the grid sizes of the fluid problem does not necessarily have to be consecutive large steps, but a step back correction of the solid motion as proposed in Section~\\ref{sec:PSTsection} can greatly improve the accuracy of the overall solution without increasing the grid sizes of the fluid subproblem further. The overall total errors for the considered example including surface contact constraints between the solid object and the environment (walls and ground) are depicted in Figure \\ref{fig:PSTtotalError} in Panel (1) vs. Figure \\ref{fig: \\ref{sec:GCRsubsection}In addition, the total computation times for the overall control as shown in Panel (1) are significantly shorter than the cases of Figure \\ref{fig:GCRAhodic}, where a fast computation scheme"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% in $2.4$ GHz on a single floor if dynamic routing is enabled (with $M=1$ and $C=2$) and only about 4\\% reduction in scanning frames. Moreover, the number of packets per frame needs to be tuned so that the errors are further reduced. We leave this study as an ongoing work, since the discussion of the overhead of dynamic routing is beyond the scope of this paper. The number of floors, the channel variations, the error rates, and the need for feedback from the RF module can all jointly decide the need for the \\textit{roster} change. Since a scanning frame is divided into several select-match sub-frames, the total number of packets that needs to be sent to change the line of view is significant. If the selection of what to scan is too frequent, this very high number of packets can cause very high overhead in the low-rate packets that have more application. Thus, the frequency of change of the \\textit{roster} should be chosen so that, (1) the errors are reduced, (2"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuation of the structure seen in the F-maps would also be observed at larger distances from the sun, and therefore the structure of the CME does not change suddenly when it exits the solar corona but continues for several days beyond $\\sim1~AU$. Custodi et al. (\\cite{custodi2016cmes}) confirmed this continuing structure of the CME using data from the {\\emph{STEREO}}-A mission and argued that the CME-associated streamer belt was essentially part of the CME, suggesting that the streamer belt may also have extended radial duration as well. Valgushev and Jones (\\cite{valgushev2016choices}) chose a statistical approach to test various models of the radial evolution of CMEs and related plasma enhancements. Their result suggested that the longer duration model fit the data better, although the result is rather sensitive to assumed models of the plasma enhancements. Other authors have also looked to explain the streamer belt associated with the CME with a number of possibilities, including that it is produts of the compression effect of the SBO"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,Hansen2018}),  the constraint matrix $L$ in (\\eqref{conjdef}) can in general not be constructed directly from the users $i$ and videos $j$.  Further, since the matrices $b$ and $H$ need to be consistent with respect to the sets of indices $\\{i_j\\}$, $\\{j_i\\}$, it becomes essential to first establish this relation, and then construct the constraint matrix $L$ for computing the bound in (\\eqref{conjbound}).  For the particular indexing scheme of equal interval sizes, the matrix $H$ can be simplified considerably with respect to the generally formulated version in (\\ref{generalh}). In the equal interval scheme, the $(i,j)$ entry of $H$ is simply the user with $i$ videos $\\{u_j\\}_{j=1}^i$ playing the video with subindex $j$:$H_{i,j} = 1$. Thus, the constraint matrix $L$ is solely determined by the permutation matrix $P$ and the user vector $u"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, we notice that for $x^{(i+1)} = x^{(i,1)}$ the second control objective is achieved at perfect control steps, while for $x^{(i+1)} = x^{(i,2)}$ the second control objective is achieved at inefficient controls steps. The main difference between the two update steps is that $x^{(i+1)}$ obtained by using the first update step is connected to the target $y$ by the linear equation $x^{(i+1)} = x^{(i,1)}$, while $x^{(i+1)}=x^{(i,2)}$ is connected to the target $y$ by the linear equation $x^{(i,2)} = x^{(i+1)}+du_i$. Thus the second control objective is not achieved due to noise and cannot be implemented in practice. However, this does not necessarily mean that the optimal control sequence $x^{(i)}$ should avoid updating to $x^{(i+1)}=x^{(i,2)}$ since doing so would stop progress"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual sequence and then conclude the global convergence of the overall method by a global in-and-out argument \\cite{iv,ilo}. On the other hand, the local regularity of solution, which is required in the classic convergence theory of linear methods, is shown to have a global manifestation \\cite{ilo,ign}. We also note that various active mechanisms have been developed later to address the local convergence in the nonlinear context \\cite{ali,arf,ank}. However, these active strategies still require to \\emph{project} the iterative solution onto the space of solution vectors of the linear problem. The established convergence theories of linear methods and the nonlinear method using these active mechanisms combine to suggest the local convergence only. The complete proof of the convenience of projectivity is given in the next item. In contrast to the classic linear methods, the solution of the nonlinear method converges to an arbitrary small error in the overall sense by the analysis and approximation theory of underdetermination in \\cite{ign}.  This implies that one can choose the basis of solution vectors to be as large as needed, and the method solution"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $u$ with smooth boundaries as in (1.~a), (1.~b) respectively. The layer terms $g_j=\\mathcal{G}_{d_j}$ imply that the solutions outside and near the boundary of $\\Omega$ are fixed, and the unknowns are only associated with the boundary. Despite of the simple boundary condition, for non-smooth boundaries $ \\Omega_{ij} $ is the product of a smooth boundary and a small smooth perturbation, there is also no intrinsic smoothness of the perturbation. Hence, we can use $ \\Omega_i $ to model the boundaries of solutions $u_i$ that are not smooth enough. However, the boundaries in $ \\Omega_i $ are fixed and not modifiable. We realize the limitation of having smooth boundaries by using a perturbation of a smooth boundary $\\Omega_{0}$ with respect to $ \\Omega_i $. Let $\\Omega_{1}$ to be the approximate perturbation, and $\\Omega_{0}^{\\sharp}$ to be the exact boundary. Then, $\\Omega_{1}\\subset \\Omega_{0}^{\\sharp}$ with"
        }
      ]
    },
    {
      "train_loss": 0.03260205078125,
      "update_counter": 5500,
      "epoch": 177.38709677419354,
      "val_loss": 6.72265625,
      "val_entropy": 0.6397705078125,
      "val_ttr": 0.6028645833333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 10) (22, 10) (23, 10) (24, 10) (25, 10) (26, 10) (27, 10) (28, 10) (29, 10) (30, 10) (31, 10) (32, 10) (33, 10"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our configuration I and configuration II in the whole timeline. The agreement between the configurations I and II in event-sorted timelines is about 0.75. The average difference between the individual events in the two sorted timelines is about 0.06. In contrast to configuration I, configuration II puts different expressions into different parts of the note. For instance, note beginnings are reserved for prevalent expressions while dominant expressions occupy its latter parts. The horizontal green lines indicate the notes that contain the two dominant expressions. As can be seen in Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2}, notes contain more than one dominant expression and thus different parts of notes are less appropriate for individual notes. Thus, a temporal partition is needed to properly access notes corresponding to different dominant expressions. For this purpose, the keyboard contains a temporal partition and it is used to select the note that contains the corresponding dominant expression. The green lines in Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2\\times 2$ square. In Case~$1$ we take $\\alpha_h=h$ while in Case~$2$ we choose $\\alpha_h=\\max_{1\\leq i\\leq 4} \\frac{\\lambda_i}{L^2} h = \\frac{0.75}{120} h = 0.006$ (note that $\\lambda_1 \\approx 1$ so that $\\alpha_h$ approximates $L_t^2 + \\epsilon$ well). In both cases the frequency analysis is performed across the boundary between the two first rows and into the third row. In Case~$1$ the estimated error resembles a low pass filter while the actual error is quadratic in time. This might be explained by the shape of the element boundaries across the gap; on a triangular element the mean is a good approximation for interval-valued data, but on an element with a gap the variance can be order of magnitude larger than the mean. For Case~$2$ the estimated error is again high but the actual"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will ultimately be determined by the elements of the global solution with the least accuracy, i.e., the fluid solution in this case. \\emph{A priori} there are two important aspects that limit the accuracy for the considered problem. Firstly, the fluid solution obtains its limit value for the function $f_{\\mathrm{iss}}(x) = \\frac{1}{4} x^2 - \\frac{1}{6} x^3 + \\mathcal{O}(x^4)$ for $x \\geq 1$. This limitation is incorporated through the user-defined function $g_{\\mathrm{iss}}(y;t_s,t_d)$ in \\eqref{eq:iss_conservative}, where $y = f_{\\mathrm{iss}}(x)$. For sufficiently small discretization steps, $t_s$ should be chosen such that $g_{\\mathrm{iss}}(t_s,t_d) < 0.01$. The solid manipulation should be done such that the solid solution is bounded with $|g_{\\mathrm{sol}}("
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": " $1.5$\\% on the 2nd floor and up to $3.5$ on the 4th floor. In addition, the addition of scanning images on the bottom floor reduces the average distance between two lines on this floor from $2.58$ to \\(1.24\\) mm. However, the identification of individual objects, especially for the mobile objects such as mobile trucks, needs the development of a separate protocol that can identify and track each object individually. \\cite{IoT2020TeMBS128A} works on the same concept, but by using the ascending numbering method for the objects. If the items are changed frequently on the working floor, the error rate will be high, and the low resolution of the image processor will also increase the error. To combat this problem, we can, without difficulty, send a still image to the mobile SBU, save the information about the objects and then, the RBU can verify the accuracy of the data by comparing it with the paper register. In this way, not only can the time for scanning"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuation of the structure seen in Fig.~\\ref{fig:fig00001}, meaning a circularity of the same value $R_p$ lasting for a few years, is caused by an evolution of the system towards a non-classical Alfv\\'enic state  of the plasma before a transition into an impulsive state with the ejection of bursts. The conclusion of this work is that such states occur frequently in these corotating streams eruption sites, and that the process of liberation is nontrivial and involves an evolution of an internal structure. Particularly, it was shown that the diameter of the rotation zone (ring) significantly changes over time, with a decrease in circulation \\citep{Valgushev:2016cstar}.  In addition, gradual approach of the median magnetic field to the Parker spiral prior to the eruption of the CME was observed. In a paper \\cite{2020ApJS..246...23C}, the formation of sub-CME structures and their evolution into a CME followed the same non-linear evolution"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,Hansen2018}), the lower dimensional classical problem of {\\em soft thresholding} has served as a main motivation. Mathies \\cite{Mathies1977} employs this procedure to obtain a soft thresholding operator and explains its remarkable properties. Other applications of this type include the paper of Gakhman and Mayada \\cite{Gakhman2013}, wherein the convolutional operator for $\\mathbb{H}$ of index $0$ is sought. Furthermore, courtaldian metrics and the associated convolutional operator have shown up in the theory of particle models of polymers \\cite{Hansen2018}, as well as in molecular biology \\cite{License2013,Matos2017}. Finally, we would like to mention the recent interest towards courtaldian metrics and convolutional operators on Banach spaces, such as in \\cite{Daws2019,Edelburg2019}. There, the relationship of courtaldian metrics to the spectral graph theory of \\cite{Daws20"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, we observe that the criteria used to determine the local minimum in~\\eqref{eq:local_minimum_condition} generates a different initial value $x_0^*$ for the simulation of the Landweber method, which might yield a different global conclusion about a global solution of~\\eqref{eq:par_aff_opt_problem}. In the non-singular case, the second column of~\\eqref{eq:parent_function} shows the value of $x$ such that $Ax=b$ is identified as the true solution. We emphasize that for $x\\mapsto x-y$, the gradient method gives solution $x(1)$ at distance $1$ from $x$, while $x(0)$ closely matches $x^*$ even at the simulated level. We therefore expect that $x_0^*$ closely approximates $x^*$, given the appropriate regularization. However, due to a phenomenon seen in recent simulations, $x_1$ slightly violates the constraint and is then terminated by the regularization, generation of $x"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual sequence and then conclude the global convergence status of the overall method only under certain conditions, for example, parameter values are all positive, or some convergence step is sparse et cetera. \\cite{odlykal2014numerical,cheng_numerical_2017,bailey_numerical_2019} present such local convergence results. The convergence studies in the linear case are also often presented in terms of individual sequences rather than the overall the system. \\cite{zeng_algorithm_2019,yang_algorithm_2019,odlykal_all_2019} prove the local convergence of the overall method to no future singularity point(s) under some conditions(e.g. parameter values are all positive). The recent paper \\cite{zeng_algorithm_2021} does not use the linear-nonlinear-linear framework there but still presents the local convergence studies under certain conditions. In contrast, this paper presents a proof of global convergence in the nonlinear case with a two-step demonstration on local"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the fluid thickness at boundary $b_j$. The boundary conditions are defined by the boundary conditions \\eqref{eq:bc_hop} and \\eqref{eq:bc_source}, and the coefficients depend on the parameters $x=\\langle \\mathbf{x}\\rangle$ \u2013 the mean position of the point $\\mathbf{x}\\in\\Omega$, $ \\zeta=\\langle \\mathbf{n}\\rangle$ \u2013 the mean direction of the normal $\\mathbf{n}$ to the boundary, and $c=c(x,\\zeta)$ which depends on the model and is defined in \\cite[Remark 2]{YarilP18}. The main problem is defined in the following way: find $u \\in \\mathcal{V}$ where $\\mathcal{V} \\subset \\mathcal{H}_0^1( \\Omega; \\mathbb{M}^c_\\infty)$ is the space of functions containing the smooth functions which vanish outside of $\\Omega$. The space $\\mathcal{H}_0^1( \\Omega; \\mathbb{M}^c_\\infty)$ is"
        }
      ]
    },
    {
      "train_loss": 0.032244140625,
      "update_counter": 5750,
      "epoch": 185.4516129032258,
      "val_loss": 6.8359375,
      "val_entropy": 0.6285400390625,
      "val_ttr": 0.5921223958333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 10) (22, 10) (23, 10) (24, 10) (25, 10) (26, 10) (27, 10) (28, 10) (29, 10) (30, 10) (31, 10) (32, 10) (33, 10"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our theoretical predictions and simulation results for the scaling constraints $\\lambda_1 = \\lambda_2 = \\lambda_{9/2}$ and $\\lambda_1 = \\lambda_2 = \\lambda_{5/2}$ in Table~\\ref{tables} for the first and the second configuration, respectively. The agreements between the theoretical predictions based on the approximation $z-1.2 = - \\lambda_1 - \\lambda_2$ and the simulations are shown in Figure~\\ref{comparison_whole_timeline_configuration_1}(c) and~\\ref{comparison_whole_timeline_configuration_2}(c), which suggests that we have closed the gap both for the spacing distributions and the scaling functions. In the Figure~\\ref{comparison_whole_timeline_configuration_1}(d) and~\\ref{comparison_whole_timeline_configuration_2}(d), the distributions of the intervals are shown, and it is clear that our theoretical predictions are well agreed with the simulations for both configurations. Note that in the case of configuration~\\ref{comparison_whole_timeline"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " over the unit $2\\times 2$ square. In Case A we use $k$-th order MINEF2 on both domains, while in Case B we use $k$-th order TNEFA. We set the preference parameter to $\\lambda = 10$ and the only difference between the two cases is that in Case B the volumes of different meshes are the same, while in Case A they precede by a factor of $1/k$. From the last two rows of the table one can see that even if the shape of the domains is fixed, the error of fluid motion depends on the preference parameter $\\lambda$, and also on the orientation of the grids. On a grid with edge length $0.5$ we observed estimators $E=0.97$, $D=0.03$, $A=9.8\\cdot 10^{-4}$ with a larger error for smaller $\\lambda$. On the other hand, on grids with edge length $0.25$ we had $E=0.999$, $D=10^{-"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will ultimately be determined by the elements of the global solution in conjunction with the local solutions of fluid and solid. We illustrate the effects of the fluid solid coupling and solid precision on the solution in Fig.~\\ref{FIG:analysis_fluid_solid_precision}. Here we use the coarse grid case with $\\kappa=0.05$ and examine the evolution of the fluid perturbation as a function of time due to the solid evolution. The solution is shown for waves corresponding to the solid snapshots from Figure~\\ref{FIG:initial_conditions} taken after $T_{\\rm s}=300$ years (the last time step). The green top panel shows the perturbation corresponding to wave number $\\mathrm{k}_3=\\($0.093\\,\\mathrm{pMpc}$, 0.081\\,\\mathrm{pMpc}$, 0.068\\,\\mathrm{pMpc}$, 0.055\\,\\mathrm{pMpc}$, 0.044\\,\\mathrm{pMpc}$, 0.037\\,\\mathrm{pMpc"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": " $1.44\\%$ down to \\(1.03\\%$ \\cite{pan2019automatic}. The results in the previous section showed that the scanning method yields less errors on the same floor compared to the original method. This is shown in Fig.~\\ref{fig:plot11}, where the errors on the \\(4th\\) floor are compared. Indeed, for the original method the method starts \\(1.65\\%\\) on the \\(4th\\) floor and increases \\(3.65\\%\\) after scanning. In comparison, the method that changes the starting state on the bottom edge of the floor (i.e., stage 2) starts with \\(1.26\\%\\) errors and increases to \\(1.74\\%\\). The difference in errors between the \\(2\\) methods is \\text{mean}\\left(\\frac{(1-\\pi_{e})}{\\pi_{e}}\\right) = 0.018 which can be considered as the maximum error that the method with scanning can have on the \\(4th\\) floor. In contrast, the original method"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it was shown that continuation of the structure seen in the F-maps would also continue to evolve and attach to the CME, an equatorial coronal hole, leading to the formation of a long, evolutionary chain function that connects the CME before and after the crossing. This continuous stream of corotating plasma will naturally continue to flow in the solar wind for as long as the channel remains open, which in this case was terminated when the CME covered the channel. Thus, the options of either an equatorial eruption or an axial eruption are both likely, with supporting evidence against an equatorial eruption \\cite{Wang:2010bsa}. The likely axial eruption occurs either through Open Field regions or through coronal hole regions, with the former being the most likely according to the analysis of \\cite{Zhang:2013pca} and construction of a modeling model of a CME through an Open Field region, termed COLA, which will be discussed in detail in Paper 2. Since such a model has not been published yet, the parameters of the event will be supported by an alternative modeling method using magnetic"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,Hansen2018}),  the constraint matrix $L(r)$ encountered in \\eqref{ParKovalenkolevelset} is ill defined for two main reasons. For one, there may not exist a basis of degree $d-1$ in which the matrix is positive definite. Furthermore, as the authors of \\cite{Hansen2018} point out, directly computing a basis of degree $d-1$ is in general highly problematic. For one, the number of functions that need to be mixed to represent the level set of degree $d-1$ into a smaller number of functions (say three) is $\\frac{d!}{9}\\approx\\mathcal{O}(10^5)$ and thus one can use only a small basis. However, as a basis of degree $d-1$ is likely to be ill defined, this approach would not be efficient. Instead, the authors of \\cite{Hansen2018} suggest using a basis of degree $d-2$ with additional functions to handle"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, we notice that for $x^{(i+1)} < \\bar$ the latter is obtained by assuming that $f(x^{(i+1)}) = \\bar$, while for $x^{(i+1)} \\geq \\bar$ the effective value is obtained by a simple interpolation. In this case the latter value is also affected by the noise, and if it is greater that its true value (that is, if $f(x^{(i+1)}) > \\bar$, where the arrow goes from) the constructed iterative solution will be contaminated by noise. To prevent this problem we from updating $x^{(i+1)}$ to $x^{(i+1+) } = \\frac1b (g(x^{(i)}) + \\eta_i)$, where $b$ is the number of iterations so far. Moreover, if $b$ is not small (that is if the noise is not large) $x^{(1)}$ will be generally not very good starting point for the search, we can fix this"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of each individual sequence, and one can neither guarantee the convergence of the differentiation sequence $D^k f \\direction\"\u9006\"$ x or that of the objective function. However, we do integrate the evidence that suggests that $D^{k+1} f \\direction\"\u9006\"$ x sequences do converge locally in all directions. For the $x \\direction\"\u9006\"$ D^k f case, we have proven local convergence in theorems  \\ref{local_solution} and  \\ref{local_solution_error} (see Section \\ref{numerical_experiments}). Furthermore,  we have established the convergence frameworks of $directional$ local convergence in integrals \\ref{integral_solution} and  \\ref{integral_solution_error} (see Section \\ref{numerical_experiments}). Indicates from the theorems, we have the $x \\direction\"plus\"$ D^k f and $d\"direction\"plus\"$ D^k f  convergence frameworks for convenience. For convenience, the paranthesis in the theorems can be replaced with any direction(s) according to the actual usage"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the fluid thickness at boundary $b_{d_j}$, $j=1,\\pm$, where $b_{d_j}=\\partial \\Omega_{d_j}$.  The boundary conditions are defined at $\\left.\\Omega\\right.|_{d_j}=\\Omega_{d_j}$, $j=1,\\pm$. We assume that $\\Omega$ is filled with a non-vacuum, homogeneous and incompressible fluid $\\mathcal{E}\\left(\\Omega\\right)$, and also suppose that the ambient space is vacuum thus making the boundary of $\\Omega$ attractive. Due to ambient attraction, it happens that very small fluid thickness at the boundary $b_{d_j}$, $j=1,\\pm$, is attracted to very large value. Thus, it is necessary to overcome the ambient attraction to have a small fluid thickness at the boundary $b_{d_j}$, $j=1,\\pm$. We consider the fluid to be incompressible non-Darcy flow that is characterized by an memory function $\\psi(\\tau)$ (relaxation rate) such that $\\Omega_{d_j}$"
        }
      ]
    },
    {
      "train_loss": 0.0486103515625,
      "update_counter": 6000,
      "epoch": 193.51612903225808,
      "val_loss": 5.482421875,
      "val_entropy": 0.73193359375,
      "val_ttr": 0.6012369791666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 10) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the maximum reduction of the CFP conversion efficiency for the two configurations. We see that in the case of the layered TMLs the maximum reduction of the efficiency is 9\\% at the maximum reduction of the absorption efficiency, i.e., 22\\%. This value is in a good agreement with our prediction of 22\\% in \\eqref{eq:eq0-4}. In the case of the structured TMLs the maximum reduction of the efficiency is 20\\% at the maximum reduction of the absorption efficiency, i.e., 44\\%. This value is also in a good agreement with our prediction of 20\\% in \\eqref{eq:eq0-5}. These values demonstrate that both designs realize a effective absorption. The maximum reduction of the absorption efficiency has been moved towards lower wavelengths, i.e., 52.3\\% and 59\\% for the layered TMLs and the structured TMLs, respectively. This move is in a good agreement with our prediction of 50\\% in \\eqref{eq:eq0"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $in = sin(2\\pi t)$ with $t \\in [0,1]$ and $out = sin(2\\pi t) + 0.5\\,sin(2\\pi t) \\, (t \\in [0,1])$ (so $in = out$ on $[0,1]$ and $in \\neq out$ on $[0,1]$, see Table~\\ref{tableheader}). The grid size $\\Delta x = 0.05$ is sufficient for all $N \\leq 12$, while for $N = 13$ it is not sufficient and the error estimator is unreliable. For $Inertia = 0.0$, we have $A = 0$ and $P = \\mathcal{I}$ so that the error estimator is exact in a priori and in a posteriori fashion for $in = in_N$ and $in = out$ but not for $in = in_N$ and $out = out_N$. For $N_K \\in \\{1, \\ldots, "
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In this section, we consider a multirate scheme for both problems, however focusing on the communication by exploiting the new $2$-way data flow between fluid and solid. In this approach, we require that the communication among the central processor (P${}_{0}$) is directly done with the subprocessors (P${}_{i} = {1 - i}$). \\emph{Thus, each subproblem is assigned with a different subset of the overall communication plan.} Thus, we observe that while fluid iterates its ``fluid level'' (i.e., the solid phase is fixed), communication among P${}_{0}$ and subprocessors (P${}_{1 - i}$) is executed. In this mode (referred to as in Eq.~\\eqref{eq:time_solid_fluid_communication_multirate}), the communication load is independent of the solid advancement mode (FD or IP), which we illustrate in Section~\\ref{sec:experimental_comparison}. In contrast to the communication strategy in Eq.~\\eqref{eq:time_solid_communication_serial}, which is designed for the solid advancement mode"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10 \\% on the top floor in $2.4$ GHz, scanning improves them to an 8 \\% average. Thus, in Fig. \\ref{fig:image9}, the red lines depict the average face locations for the fully covered floors using the fixed frames while the blue lines for the scanning frames. As we can see, the face locations on the top floor change dramatically by using the scanning frames, and the average floor-wise registration errors reduce from 10\\% to 8\\%. We can also observe that the lower floors benefit from the scanning frames as the face-to-face meetings tend to occur on different parts of the building. In addition, the average motion of the face frames on the top floor is 0.67 while the lower floors are 0.96. Thus, the sequential ordering of the frames on each floor tends to be more effective for the lower floors as the faces exhibit more motion. In addition, as shown in the Appendix, the minimum average distance between the frames on each floor is 1.6 meters in the $2.4$ GHz"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "p5448} it is shown that the dynamical analysis is more consistent with the non-spherical shape of the CME, where the radius $R$ of the primary cone should be chosen as an average value of the {\\`a}reometry data in \\cite{Horne:1979p5744} or as a factor of the initial axial ratio in \\cite{Deheuvels:2004p5155}. In either case, the height $H$ of the primary CME can be considered as the  average SAPV height of the event, measured in hours; the velocity difference between the S/C and the CME, w.r.t. to the Sun, is also an average value , measured in m/s. In \\cite{Valgushev:2015p5448}, this value has been 0.187 m/s. The height and velocity difference can be extended to higher altitude if the distances of the S/C to its parent and to the CME, as well"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2012,Milanovic2012,Gatis2014,Fonlena2015}, and the discussion in \\cite{Gatis2015}), the target function $F(t)$ must satisfy certain conditions for instance in order to apply the method to rates of convergence. This is the case of accelerated discrete rate of convergence using cone relations, where the usual rate of convergence is replaced by an accelerated rate of convergence over a time-step $\\tau$ to the original problem, but only if the target function $F(t)$ satisfies conditions (C1), (C2), (C3a), (C3b) from \\cite{Fonlena2015} (see Theorem 2.1 in \\cite{Gatis2015}). For instance, a number of scientific applications involve simulated experiments where one would like to illustrate the effectiveness of a method to solve a problem of interest using a certain numerical discretization. In such cases, it would be convenient to be able to approximate the discrete rate of"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The gradient estimates are not contaminated by any solution configuration, except for the so-called \\emph{dual boundary}, i.e. the boundary for which $\\bar{b}$ equals 0. The effective computation of the gradient for the Landweber method indeed depends on the Almgren-Sibert gradient, and also on the error for the Goldfarb-McOwen gradient \\cite{GM}. For the Almgren-Sibert gradient, none of the solutions for which the dual boundary is crossed are needed. Since we assume that the solution $s_i$ is better than all others, no other solution interferes with the computation of the Almgren-Sibert gradient. For the Goldfarb-McOwen gradient, at the exact date of submission of this paper, we have only seen solution configurations that do not affect its error. As the level of tolerance $\\varepsilon$ is getting reduced, the Goldfarb-McOttman error will affect the Landweber iteration when $\\varepsilon$ is of the order of the gradient"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution to converge in a neighborhood of the target solution. This is even true for the simplest case, $k' \\in \\{1, \\cdots, k_i\\}$, where we have the belief that the convergence can occur globally. The reason is that we use the continuity of $f(u, x_i) = g$ and $f(x_i, x) = 0$ as the \\emph{sufficient} condition for the convergence, which only holds locally. We observe that the proof of convergence actually shows that the approximate solution can be made to converge at a point, $x_{i+1} = x^*$, and stay there forever, even with time evolution. However, we observe that each step from $x_i$ to $x_{i+1}$ always \\emph{forces} the solution to $x_i$ and $x_{i+1}$ to satisfy the aforementioned continuity, and this continuation can be seen from the \\emph{continuity} term in (\\ref{eq:DualConsDualVarDisCon}). Therefore, the"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the test function. $\\gamma \\in \\mathbb{R}$ is a constant and $f_{\\Omega}^j$ is the flux provided by the underlying $P$-based $L$-gradient flow scheme discussed in Section \\ref{section: examples } (for $N=2$, see Appendix \\ref{section: A2} for a similar scheme that does not involve reflections; for $N=3$, see Appendix \\ref{section: A3}). Note that if $\\nabla u \\geq 0$, then the gradient of $f_{\\Omega}^j$ must also be $\\geq 0$. However, this is no longer the case in the presence of reflections. In order to prevent $f_{\\Omega}^j$ from having negative components, if $u \\leq 0$, then $\\nabla u = 0$ and $f_{\\Omega}^j$ is zero everywhere except on the boundary, where the direction of the domain $\\Omega$ is changed (see Appendix \\ref{section: A2} for the $N=2$ case without reflections, where in"
        }
      ]
    },
    {
      "train_loss": 0.0450830078125,
      "update_counter": 6250,
      "epoch": 201.58064516129033,
      "val_loss": 6.267578125,
      "val_entropy": 0.649169921875,
      "val_ttr": 0.591796875,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of the Configuration 1 and 2 compared to the performance of the Optimal Configuration when the reconstruction cost $c_r=1$ (on the right) and $c_r=0.5$ (on the left) in the deadline distribution of Figure~\\ref{deadline_distribution}. In both figures, the same signal strength distribution shown in Figure~\\ref{sig_strength_dist} is used. The horizontal lines in Figure~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} show the total number of signal users $N_{sys}$ in the system. As is clear from the figures, in configuration 1, which is not coded, all 8 antennas are assigned for signal recovery and the average gain per subchannel is about 1.8. As a result, the number of users that can be protected increases with the number of users in the system. In contrast, in configuration 2, which is coded, we assign 4 antennas for coding and the other four antennas"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In^\\alpha, In^\\beta$ defined in~(\\ref{standard_inclusions}) for different choices of $\\alpha$ and $\\beta$. The grids $K_i$ are generated from the command line in~\\cite{UBM} for $N_i = 2^i$. We set $\\eta_i = 0.9$ and $\\lambda_i = 0.9$ for all $i$. The trends in the last three rows are similar to the trends in the first three rows. For small $\\alpha$ and small $\\beta$ the error of the minimum inclusion is almost uniform and, thus, the error estimation is not very accurate. This can be seen from the last two rows. In rows three and four the estimation of the error of the middle inclusion is not accurate for large $\\alpha$ and large $\\beta$. This error is, however, less inaccurate than the error in the last two rows. One reason for this is that the gradients in the error of the middle inclusion are much smaller in rows three and four than in the previous two rows. For"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the fluid subproblem, a posteriori errors by reconstructed levels of smoothness of the solution $f$ are analyzed based on the element basis. The multiresolution framework from Fourier analysis is extended here and large-scale fluid simulations are validated by a second order numerical approximation on coarse grids. The results imply that the solid subproblem has to be solved first and must be finished enough to allow for a second subproblem. We conclude this section by some numerical tests, which confirm the high accuracy for low element densities and prove the robustness of the results even for high densities and large element sizes. In contrast, the solid subproblem is solved rapidly using simple steps based directly on the discrete $C$-space $\\mathcal{H}_C$. Feature capture in each layer of the interpolating B-spline functions is accomplished using FEM. Screens from the numerical tests illustrating the high-quality solid boundary are given in Figure \\ref{fig:boundary}. Feature capture in the fluid subproblem is validated by the validation in the solid subproblem and results from a second order discretization are shown to be valid for very high resolutions"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $5\\%$ in the first $20$ frames presented in Fig.~\\ref{fig:image_hier}. Scans can also reduce errors between different floors. Looking at the errors between floors $1$ and $2$ in Fig.~\\ref{fig:image_hier}, one can observe that the errors on the pictures that are shared by both floors (i.e., the pictures of the entrance and the escalator on the left-hand side of Fig.~\\ref{fig:image_hier}) are still rather high. However, the in-service time of these objects is very high and the chance that a new item is placed on the corresponding area is very low. Therefore, the errors on these pictures can be improved by using data retrieval instead of scanning. The results obtained from the experimental analysis presented in~\\cite{Livne2014RAS} show that the use of a reference frame can improve the accuracy of mobile computing by $10-15\\%$ and the results on a laboratory experiment~\\cite{Shi2"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "sae}, it is shown that the dynamical analysis of CMEs is highly inconsistent in the journal logs available through the Internet websites of the S/C operators. Due to this, the statistical evaluation of the geometries of CMEs encountered by Jupiter is highly unreliable. To overcome this problem, a new classification method based on the Keplerian orbital parameters of CMEs along with the latitudinal and longitudinal parameters of encounter was developed and performed on the J-LOOS archive \\citep{Shaviv:2011}. The range of temporal coverage was extended by about 100\\% for the number of observed CMEs encountered by Jupiter compared to the journal-log only analysis. However, the response rate stayed the same about 10\\%. The difference in the CME geometries was examined for the two distinct timing schemes, journal-log only (J-LOOS beginning on Oct. 11, 2009) and journal-log plus S/C (J-LOOS beginning on Dec. 12, 2008). It was found that the ballistic CMEs"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,EG2012}), in the case of computed tomography (CT) it is applied to define a new set of angles $\\theta_{\\star}^i$ for the basis B of the image space. Furthermore, it is proposed to set the angles $\\theta_{\\star}^i$ equal to the rotation angles $\\arc\\theta^i$ of the kernel \\eqref{kernel}. Thus, we have the idea to define the rotation $\\arc\\theta^i_{\\star}$ of the CT acquisition as well, such that the kernel $\\phi^{L,R}_{\\arc\\theta^i_{\\star}}$ fulfills the noise conditions from Theorem \\ref{thm:linearimageprojection}. A similar approach was taken in \\cite{Masuda2015}, where the angles $\\theta_{\\star}^i$ on the basis of the NURB basis are defined to be the rotation angles of the kernel, corresponding to the noise conditions. There, however, it is assumed that the noise conditions are satisfied for all directions on the image space, which is for a number of applications"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f'(x)$ and $f(x^*)'.$ While the Landweber iteration explicitly seeks for the numerical approximation of $f'(x)$ in order to reduce the iteration error, the spurious result shown in (e) of \\Cref{sec:example} suggests that such a strategy may not be effective. In fact, the gradient estimated by the Landweber iteration not only is inaccurate, but also wrong. The opposite direction of the railroad indicates a negative gradient value, which is obtained by using a finite interval to approximate $f^2(x)$. Such result is obviously incorrect. However, the heuristic principles suggest $x$ should be moved in the direction of $-A^{-1}b$ in order to get an upper bound for the objective $f(x)$. Therefore, the only choice for $x_{k+1}$ is the direction of $x_k$ towards $x^*$. Thus, the Landweber iteration can be simplified to the following simple iterative rule: $x_{k+1} = x_k - (b^T-(b^T"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution on an individual subproblem, which may not coincide with the actual problem. However, we argue that the solution of the actual problem also will be solved on a subproblem, and the solution therefore will be exact. To justify this claim, we rely on the argument from Step 3.3 of Theorem \\ref{thm:thres_linear} (note that the same argument was used in Chapter 3 of \\cite{arayi2014development}). Let $w_{k+1}$ be the $k+1$-th iteration of the optimal solution of the $k$-th problem in the sequence, $w_k$. We claim that the solution $w_{k+1}$ is optimal for the problem $y_k = r$ if and only if $w_k$ satisfies all conditions required in the definition of $f^{*}}$. Clearly, this is true at the first iteration, and also by Assumption \\ref{a:af_linear} it will be true in the nonlinear case, because the functions $f_i^{(l)}$"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the potential, which is sufficiently small for all $j\\in\\mathbb{N}$ such that the problem still has unique solutions. For that, we break down the whole problem into different parts and solve them individually. First, we verify that the matrix $A(\\zeta)$ has small eigenvalues using numerical computation. Secondly, we construct a solution $(u,\\lambda)$ to $(1.1.1)$ using these eigenvalues. Thirdly, we check the validity of $(1.1.2)$, i.e., $au \\geq -g$ for all $\\Omega$, where $a$ denotes the constant from Theorem \\ref{thm:fem} and $g$ is defined in (1.1.1). Fourthly, we verify that $\\lambda$ satisfies $\\int_{\\Omega} \\dd v = 0$ in the majority of instances. Finally, we summarize our findings in a $(u,\\lambda)$ that satisfies all these conditions. Note that we face a problem in evaluating $u$ at the boundary of $\\Omega$ since the condition $u=\\pi$ is"
        }
      ]
    },
    {
      "train_loss": 0.0327587890625,
      "update_counter": 6500,
      "epoch": 209.6451612903226,
      "val_loss": 6.603515625,
      "val_entropy": 0.63623046875,
      "val_ttr": 0.6090494791666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of the Configuration 1 and 2 in the number of requests processed during the first non-zero time interval, and the performance in the response latency in the same time intervals. The results are obtained for the shortest timeline of one hour and the highest input rate of 800 input tuples per node a second. We observe that the average response time in configuration 2 is larger than in configuration 1 for small values of the processed tuples count (Table~\\ref{response_latency}), this happens because $P$-join in configuration 2 is not fully completed during a time interval and it affects the latency of the simplest queries (equi-join and equal-to-null query). However, for a large value of the processed tuples (more than 3000), configuration 2 has a lower average latency than configuration 1 due to larger batch sizes in it. We can see that in the first stage of the buffer (not full) the response latency is larger in configuration 2 than in configuration 1. This happens because while processing one tuple in configuration"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In_{\\alpha, \\beta} $ from~\\eqref{eq:Inalphabeta} on $[0,1]^6$ for fixed $\\alpha = 1$, $\\beta = 10$, and for fixed $\\alpha = 10$, $\\beta = 1$ where the grid sizes $T_k = 0.1$ are chosen such that $L_k = 1 $ for $k =1,\\ldots,m$ where $m$ is seen from Table~\\ref{fluid_residuals_uniform_equal}. We assume a boundary condition $u = 1$ on the boundary and a boundary condition on $\\partial_{\\nu} u = 0$ on all other parts of the domain. The results for the a priori error estimator is seen in Table~\\ref{fluid_residuals_uniform_equal} as well, where we use the same grid sizes $T_k = 0.1$ for a finite volume discretization on the same domain. As we see from the table, the a posteriori error is constant for the relevant time"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends also on the solid \\cite{HXD} and fluid approximations. In order to prevent early concrete curing, realistic fluid simulations are required at later times of the simulation, causing additional computational cost compared to simulations where the solid evolution is not taken into account. Therefore, a careful selection of the time steps has to be performed in order to satisfy several conflicting requirements. The fluid evolution scheme has to be solved at each time step, however too early time steps will cause the solid structure to be evolved outside of the expected curing time. Too large time steps will result in insufficient accuracy, whereas too small time steps will cause a significant increase of the overall computation effort. In order to avoid \\textit{ Explorationary Computing}, we propose a procedure based on the Concept of Fundamental Time Step (\\textit{FTS}), which we adapt to structural curing calculation. The idea is to fix the overall maximum time step, which we call the Fundamental Numerical Time Step (\\textit{FNT}). A computational task is solved in $M$ steps of size $dt$, meaning that $dt$ comprises $"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% on the B3 floor (Fig. 11b). In fact, the relative difference between the counted and the scanning measurements ex- presses on the diversity of the layouts: While the measurements on B3 seem random, high errors are also observed on the layouts with a larger diversity of antenna attachments, such as corridors or small islands, which are found on floors $1$ and $2.4$. On $1$ floor, the measurements show in particular a high dependence to the starting point, which is understandable given the 17 routers on this simple layout (see Figure 11a). However, by comparing the measurements on different floors, we can see that the floor errors are relatively stable and tend to be about the same independently of the starting point: $10\\%$ on B3, $9\\%$ on $1$ and $2.4$. Moreover, for all floors, the measurement errors seem high when not covering all patches and are related to high disagreement on several patches, as shown in Figure 11c. However, on a visual basis, the measurements"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "p5468}, it was shown that continuation channels of switchbacks with time evolution are well described by perturbation theory, which implies that switchbacks are not due to a perfect local equilibrium. In \\cite{Shi:2016p5283}, the same conclusion was also derived based on a ballistically propagated plasma pack-up model. Moreover, it was shown that the residual plasma pack-up from a shock weakens with distance due to the decreasing effects of perpendicular cooling, which means that the occurrence rate of switchbacks gets weaker away from the CME source. In \\cite{Valgushev:2015p5468}, the chances for a white-light coronal jet to transform a shock-driven flux rope to a switchback were discussed. Conclusions were derived that more consistent flux rope configurations closer to the source are more likely to produce clinical switchbacks. Better 3D structure of flux ropes expected closer to the source contains more magnetic field lines and more effective flux refilling after an event causing less likely chances for a flux rope to rotate about the exit"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2015,MAGA2016}), the lower constraint condition \\eqref{constraintcondition} does not be automatically be satisfied for all applications. During the training, we impose the constraint condition by setting~$s_{ij} s_{ij} \\geq 0$ and writing both the orientation angle~$\\theta_{ij}$ and the magnitude~$m_{ij}$ as a cumulative sum over time that is registered in a set of temporal snapshots~$\\{\\hat{x}(0), \\hat{x}(1), \\hat{x}(1VERSECURLATION} is a homology 1-handle with respect to the boundary operator $\\partial_1$ on $S^3$. The operation of including higher dimensional handles, so-called 2-handles and 3-handles, is given by a \\emph{dimension doubling construction} \\cite{Kaltenbacher2015}, which consists in taking a handle of dimension $d-1$ and adjoining a handle of dimension $d$ near handles from the former that have handle index $d-1$. The"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x^*)$ and $f(x^*)$, we notice that for $x^{(i)} \\\u7b49\u4e8e x^{(i-1)} + \\Delta x^{(i-1)}$ the value $f(x^{(i)})$ is generated by the cumulative contribution of $\\Delta x^{(i-1)}$, which is very likely to be associated with a spurious local minimum. For these types of solutions, the number of Landweber steps is usually one, we just need to point that since $f^\\prime(x^*) \\far away from zero, $f(x*) = \\sum_{j=1}^n \\Delta x_j^2$ is far from the actual value, the error of these type solutions is very large, while the setting of the initial step size $0.1$ yields a small error, therefore the error of these type solutions are deviates the behavior of the heuristic, we should consider special cases for this situation. For $x^{(i)} \\\u7b49\u4e8e x^{(i-1)} + \\Delta x^{(i-1)}$ if the"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution at each iteration, and one can neither provide a criterion to determine when the solution becomes acceptable, nor whether the solution would eventually converge. Moreover, although the approximate solution in the nonlinear case does not satisfy a regularity, we demonstrate that it still works well in actual applications. In comparison with the linear case, the approximate solution in the nonlinear case is more robust to small perturbations. For example, we conduct numerical simulations on approximating the generalized bi-harmonic problem in \\eqref{eq:GBH} (see Section~\\ref{SEC:GBH}), and find that the convergence is locally implemented. Moreover, we also show that the kernel and vector converge locally. In contrast, in the linear case ($\\mu=0$), the kernel and vector converge globally. It is worth mentioning that, although the convergence is proved to be local, the final converged solution already satisfies a well-optimized solution. In contrast to the linear case, the nonlinear case does not require normalization of the vector $U_{k}$ in order to satisfy a reasonable error. In fact, we conduct test cases to compare the convergence in"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the marginals of the unknown probability measure $\\mathbb{P}$ on $\\Omega$, $j\\in\\{1,2,\\ldots,k\\}$, are positive smooth functions. The objective of we want to find a solution to the problem $\\max_{\\mathbb{P}} \\mathbb{E}[\\xi]$, where $\\xi$ is some function of the probability measure $\\mathbb{P}$, which is difficult to define in the general probability model with $k$ possible outcomes $y_i$, $i=1,\\ldots,n$. However, if the variate $X$ is defined on a local probability space $(\\Omega, \\mathbb{P}, \\mathcal{F})$, where $\\mathcal{F}$ is a $\\mathbb{R}$-algebra and the national $\\mathbb{R}$-algebra of estenders of estenders of $\\Omega$ is chosen to be $\\mathcal{M}$, then a finite measure $\\mu$ is defined on $\\Omega, \\mathbb{R}^k\\subset \\mathcal{M}$. If $\\mathcal{H}$ is a sub-$\\mathbb{R}$-algebra of $\\mathcal{M"
        }
      ]
    },
    {
      "train_loss": 0.03211865234375,
      "update_counter": 6750,
      "epoch": 217.70967741935485,
      "val_loss": 6.70703125,
      "val_entropy": 0.6162109375,
      "val_ttr": 0.5791015625,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of the Configuration 1 and 2 in the number of requests processed during the first non-zero time interval, and the performance in the response latency in both configurations. We observe that the performance of both configurations in the number of requests processed is similar in small time intervals of 1 second and 2 seconds. However, the performance of the configuration 2 is faster for larger time intervals such as 10 seconds and 20 seconds. In addition, the performance of both configurations is similar in terms of the average latency, and this performance is significantly better than the performance of Configuration 1. The performance of both configurations is similar in the small size of the time interval such as 1 second and 2 seconds, but the performance of the configuration 2 is faster for larger time intervals such as 10 seconds and 20 seconds. In addition, the average latency is performance which is significantly better than the performance of Configuration 1. Therefore, the configuration 2 is a better choice for the throughput and latency performance. Furthermore, we can see that, the radius of"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In^\\alpha, In^\\beta$ defined in~(\\ref{fixed_time meshes In}) for $\\alpha=\\beta=1$, $\\alpha=\\beta=0.1$ and $\\alpha=0.1,\\beta=1$. We set $s=j/T\\in(0,1)$, $p=2s$ and $q=p-1=1+2s-1=1+s(s-1)/2$, where $j=2\\pi n+1$ for $k=1,\\ldots,M$. Note that $In^\\alpha\\in BV(\\mathbb{T}^2)$ with a smooth boundary and $In^\\beta\\in BV(\\mathbb{T}^2)$ with two smooth edges. In~\\cite{eu88b} it is shown that $L_p$ estimates for the a posteriori error depend on the control of the kernel $h^2u^{l+2}$, where $h$ is the mesh size and depend polynomially on $p$. Our experience, however, suggests that the hybrid"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends also on the solid subproblem. In \\cite{AR2011} a hybrid method using a finite element method for the solid output values and a spectral spectral element method for solution of acoustic problems is proposed. An acceleration via an iterative solution of the solid subproblem using noisy solid values was proposed in \\cite{AR2013}. Also a precomputation strategy for the solid subproblem based on a precomputation of the expected future evolution of the solid input values was proposed in \\cite{AR2013}. A procedure for precomputation of solid values based on a solution of a generalized eigenvalue problem for the acoustic operator without noise and with suitable noise is suggested in \\cite{C2003}. An algorithm for precomputation of solid values based on a solution of an algebraic problem resulting from an interpolation of the noisy solid values to output points free from noise is proposed in \\cite{C2004}. An algorithm for prediction of solid values with sub-Nyquist accuracy is proposed in \\cite{C2006}. An"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about 4\\% in the highest retrieved SNR of this study (about $18$ dB), which accounts for about $20\\%$ of the total errors in $2.4$ GHz. We note that the floor errors are mostly caused by the lack of entry in the overhead map, which is about $600$ meters in area, and inadequate direction of movement information. Furthermore, scanning frames reduce the number of outliers in the lifted vector map, as observed in Figure \\ref{fig:rec_snr_2}, which further account for about $10\\%$ of the errors in $2.4$ GHz. Again, the outliers occur because the mean map has missed some of the small cells, hence, the lifted vector map does not have the ceiling cells to \"anchor\" the small cells and thus, they get drifted away. Figure \\ref{fig:image_error_2} further account for the remaining $5\\%$ errors in $2.4$ GHz, which is due to the effect of the image correction in the scanning frames. However,"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "p5468}, it was shown that continuous perturbation of the system in respect to the universal radius $r_{2}$ produces a new extreme solution which is also an extreme solution and which is much smaller than the previously observed version. In other words, there exist cmEs such which their progenitor ejecta to be occurring in the inner region of the galaxy. Indeed, considering the solar angle $\\theta$ in the plane perpendicular to the SMR (the impact axis), we see that the\u5141\u8bb8 angle range is quite small (from $\\theta \\approx 0$ to $\\theta \\approx \\pi/2$), which means that such events occur  in the inner galaxy region. Moreover, examining the PC structure and the radial velocity shift of the CME, we see that in comparison with the classical cmEs this object has much more restrictive conditions. The PC structure of an CME is not very sensitive to the solar angle $\\theta$, which allows for a significant dependence of the coronal velocity on this angle and thus decreases the reliability of the channel hypothesis. Indeed, the ejecta velocity is significantly dependent on the solar angle $\\theta$"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Figiel2017} and references therein), the computational challenge remains extremely difficult. However, recent studies \\cite{Figiel2017,Hanel2018} \\cite{Figiel2017} have proposed a number of new numerical methods for directly solving the optimal control problem and optimizing the solution. In most cases the computational effort is overwhelming and it is effective only as a preliminary screening method. Therefore, an effective preliminary screening method is highly needed and we propose using the solution of the cone condition as a thresholding application for our preconditioning method \\cite{Hanel2017}. Thus, first solve the optimal control problem, then solve the cone condition using some available code. Then, we use the solution of the cone condition to determine the direction of the optimal solution on the mesh. The error resolution is of order $O(h)$ and the method is highly efficient. In fact, the main computational effort is the solution of the generalized eigenvalue problem for the projection matrix and it can be done even with a linear-linear"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f^\\prime(x)$ and $g(x)$. We observe that the function $h(x)=-x^2/2+3x$ has a spurious local minimum at $(0,0)$, and the function $h^*(-x)=1/2-3x^2/2$ is a efficient estimate for $f^\\prime(x)$. However, for the Landweber iteration, one can definitely improve on a single iteration the accuracy of the $f$ estimate by using the following strategy: choose $b$ at the output of the Landweber iteration as the previous $b$ was not sufficient, go to $x_{k+1}=x_k+b/(k+1)$, and then estimate $f^\\prime(x_{k+1})$ with $h(x_{k+1})$. This is effective because for this strategy, $h(x_{k+1})$ is not only efficient at time $k$, but is a closed form function that is not sensitive to later changes in $x_{k+1}$. The disadvantage of the"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution at each iteration, and one can neither construct a sequence that converges to the general solution to the $k$-th order or $k + 1$-th order in the way similar to the case of a linear case, see Table \\ref{tab:comm_result_linear}. In the linear case, the underlying solution evolves by a linear discrete scheme, and one can in principle construct a sequence that converges to the general solution to any order. However, such a case is not applicable in the nonlinear case because the underlying solution evolves by a Newton-Kantrovitz scheme which is not linear, but highly nonlinear. However, we do observe that the general solution on which the sequence converges in the linear case has the same regularity as the approximate solution on each iteration in the nonlinear case. [We also would like to mention that we have carefully read the paper \\cite{fixedpointreport} for convergence in the sense of $L_2$ norm to proof of nonlinear case linear stability, where the authors also suggest a strategy for the convergence investigation.] In addition, from the quality of"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the potential, which is sufficiently small for all $j\\in \\mathbb{N}$ such that the problem still has unique solutions. For each $j\\in \\mathbb{N}$, we break the vector $\\boldsymbol{\\beta}_j$ into two vectors $\\boldsymbol{\\beta}^{(1)}_j$ and $\\boldsymbol{\\beta}^{(2)}_j$, which are defined as $\\boldsymbol{\\beta}^{(1)}_j \\le \\boldsymbol{\\beta}^{(2)}_j$, which satisfy $\\boldsymbol{\\beta}_j^{(1)} \\le \\boldsymbol{\\beta}_j^{(2)}$ in the sense of $\\ell_1$ and $\\boldsymbol{\\beta}_j^{(1)} \\le \\boldsymbol{\\beta}_j^{(2)}$ in the sense of $\\ell_2$. The constraints of $\\left(\\boldsymbol{\\beta}_j^{(1)}, \\boldsymbol{\\beta}_j^{(2)}\\right)$ are expressed as $\\boldsymbol{\\beta}_j^{(1)} \\in \\mathcal{B}_{C_{\\Omega}} (\\boldsymbol{\\beta}_j^{(2)})$ in the sense of $\\ell_1$"
        }
      ]
    },
    {
      "train_loss": 0.031849609375,
      "update_counter": 7000,
      "epoch": 225.7741935483871,
      "val_loss": 6.7734375,
      "val_entropy": 0.6087646484375,
      "val_ttr": 0.5807291666666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of the Configuration 1 and 2 in the number of users the system can support in the worst case. The performance of the bypass-based configuration is shown in Figure~\\ref{comparison_whole_timeline_configuration_1} in a low and high backlog size. The system can support 88 users in the high back log scenario and can support no more than 26 users in the low back log scenario. In addition, it is evident that the configuration of the registration module can support up to 10 users (or more than 80 kbps) even in the worst case where $K=3$ and $N=12$ are fixed. However, the performance of the endorsement-based configuration in Figure~\\ref{comparison_whole_timeline_configuration_2} shows that it can slightly improve the performance compared to Configuration 1. For example, the system can support 76 users in the high back log scenario and can support 30 users in the low back log scenario when $K=3$ and $N=12"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In_{\\alpha, \\beta}$ defined in~\\eqref{in_alphabeta}.  The parameter choices are given by~\\eqref{eq:a1_beta_conditions} and~\\eqref{eq:a2_alpha_conditions}.  The solution is obtained by solving the simplified problem~\\eqref{eq:problem_exact} in the interval $[0, 1]$ with $\\alpha = 1.0$, $\\beta = 0.7$ and $ d = 0.05 $ using the same non-linear solver as in Section~\\ref{sec:numerical_experiments}. From Table~\\ref{fluid_residuals_unif_equal} one can see that the a posteriori error is constant on each time step and equal to the value obtained at $t = 0$, which is impossible. In contrast to the a priori estimator, the a posteriori estimator requires no modification with respect to the shape of the intervals on $[0, t_k]$. However, the presence of such large errors in the estimation of the error on each time step indicates that"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends also on the solid subproblem, standard methods for finite element methods, like the Weiner iteration or the GLL sweeping method, have been recently applied to solve the solid substep and in some cases the solid subproblem has even been linear. \\cite{Neela} shows that linear solution of subproblem increases the overall numerical error by 10\\%. \\cite{Stand_Meso} is another numerical study for investigating different methods for the solid subproblem. However, such solutions are usually not valid in cases of complex boundary conditions, which may occur regularly on boundary faces in conjunction with use of some finite volume method for the fluid subproblem. Therefore, we consider  straight geometry with linear boundaries in later sections, where we test convergence for the solid subproblem. In this test, we prove that the iterative procedure proposed in this paper yields satisfactory results even for a linear solution of the solid subproblem. At last, we consider the numerical test, which shows the overall convergence of the procedure in cases with complex boundary conditions. The solid subproblem, which is preconditioned"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% on the B3rd floor. In Fig. \\ref{fig:image21}, the average power change between the ``Home\" and ``Living Room\" gRSSIs is significant in some rooms but remains small in the dining room and the kitchen. As a result, the RSU is not determined correctly in these two rooms about $15\\%$ of the time. An alternative approach here might be using the GPS data \\cite{locating_gps} or pre-correlating the floor data for each receiver \\cite{correlating_correlating}. Correlating floor data on a regular basis is an ongoing research work that the authors are engaged. Correlating the floor data for a large number of receivers is complex, since the floors are connected by many corridors, and one should also considers the non-correlations resulting from the entries and exits at the boundaries of each floor. The cordering of floors, therefore, requires considering a part of the corridor. \\cite{correlating_correlating} present their results on cordering in a"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "p5468}, it was shown that continuous perturbation of the system in respect to the international definition of the solar longitude allows one to population both the trans-Uranus and the post-Celebus\u5708 with values of the solar longitude less than $\\lambda_0=26^{\\circ}$. If we keep the source region CME in the direction of its trajectory, then the post-Celebus ring appears reasonable expected. However, it crashes due to the impact of the international transit model errors (see, for example, the Fig.3 in \\cite{Valgushev:2015p5468}). As a result, the post-Celebus circle receives significant perturbation from neighboring circles. The more unexpected problem appears in the strong population of the trans-Uranus circle. This circle receives the solar longitude from the Tachocline region near the ecliptic plane, where trans-Geminids circle should populating. This population is absent due to the continuous perturbation of the solar longitude definition in respect of the TraN model. Thus, the populating of"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Lu2015,Gbur2016,Yao2018}) the study of the convergence of the optimal solution to the geometric threshold condition \\eqref{geometricthreshold} has been focused only on the classical MPPC model with constant tangent cone \\eqref{mppcmodel}. In addition, a numerical simulation study was conducted in \\cite{Kaltenbacher2014}, thus a formal theory of conditions for convergence is still missing. Submitted to AMS, this paper provides such a formal convergence theory for optimal solutions of the partial differential equation system generating the numerical data for the classical MPPC model. In this context it is useful to recall that the geometric threshold condition \\eqref{geometricthreshold} holds by virtue of Theorem \\ref{convcond_constanttangentialcone} below. For the time being, the implicit function theorem \\cite{implicitFT} has been used in \\cite{Kaltenbacher2014} to derive the local existence of the solution of the evolutionary problem \\eqref{harmonicproblem} associated to the"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f$'s and $r$'s, which can be simplified when the solution evolves around local minimums. For the Landweber iteration, it always evolves toward the $k$-th iteration solution from the previous section, and hence the effective efficiency is not apparent. However, if the iteration evolves toward the spurious first minimum, efficiency might be unefficient. The effectiveness of the efficient methods can be confirmed by exploring if the solution evolution is moving toward the second local minimum. If yes, one can trust the effectiveness of the methods. However, if not, one should either explore other potential solutions, or explore more theoretical validation methods such as the economic sensitivity analysis alysis~\\cite{Veyrassas2018} for economic analysis, and system reliability analysis for reliability analysis. Also, as we have discussed in Section \\ref{sec:systemOptEff}, the stability of a system is always evolving close to the local minimum, and therefore the classical noise analysis would be efficient for such methods. One note is that we explore the efficiency of the solutions in this section only for the heuristic users"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the solution on a neighborhood of zero, which is independent of the convergence of the gradient. The proof of this property is actually a combination of the Linear case (Theorem \\ref{thm:basic_linear}) and the Global case (Theorem \\ref{thm:basic_global}). We note that the solution in the neighborhood of zero must satisfy the \\emph{new} equation $f(x) = 0$, instead of the original equation $x = 0$, where $f(x)$ can be different from $x$ itself. Such a difference approximately captures the error and also generates a new equilibrium. The argument of Theorem \\ref{thm:basic_global} shows that the convergence of the method in the nonlinear case is not globally uniform, but only local. We note that there are many methods that can be employed to ensure the method trajectory asymptotically converges to the global solution. The recent works \\cite{Dre2016,NoFi2017,NiJo2019} are the examples. However, the \\emph{optimized} solution prescribed by"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the potential, which is sufficiently small for all $j\\in\\mathbb{N}$, is bounded from above, and satisfies a conventionally positive definition. Similar models are considered in \\cite{GSP08, CS10, GSP10}. The condition of stationary stability is satisfied for all $\\xi\\in \\Omega$, which is expressed as $0\\in \\partial_x \\left(\\phi_1(\\xi)\\right) \\cup \\partial_x \\left(\\phi_2(\\xi)\\right) $, where $\\phi_j(\\xi) = D_{x}(\\xi)F_j(\\xi)$ for $j=1,2$, and $D_{x}(\\xi) =\\{D_{x}(\\xi),\\| \\xi \\|< \\rho(\\xi)\\}$ denotes the smooth cone of the gradient in $\\R^N$ and $F_j(\\xi)$ for $j=1,2$ is given in \\eqref{eq: Fn1} and \\eqref{eq: Fn2}. In \\cite{CS1"
        }
      ]
    },
    {
      "train_loss": 0.03166845703125,
      "update_counter": 7250,
      "epoch": 233.83870967741936,
      "val_loss": 6.873046875,
      "val_entropy": 0.5968017578125,
      "val_ttr": 0.5729166666666666,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the performance of the Configuration 1 and 2 in the number of users (i.e., $L$): in Figure~\\ref{comparison_whole_timeline_configuration_1} it is possible to see that system size has a significant impact on the overall performance of the algorithm: in system size $N=480$ the algorithm requires about $80$ nodes (thus $T=80$, see Table~\\ref{the_tables_of_the_new_results_page}) and in system size $N=2400$ the algorithm requires $161$ nodes (thus $T=161$). On the other hand, in all the systems size under discussions the algorithm is very efficient (in the range of microsecond per node per day): for system size $N=480$ the algorithm runs in about $4.50$ microsecond per node per day and for system size $N=2400$ the algorithm runs in about $16.60$ microsecond per node per day"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In^\\alpha, In^\\beta$ defined in (\\ref{p_in_example}) with $\\alpha=\\beta$ equal to 0 or 1. The grid size is $H=1/1000$ and the control $R=100$. For $d_k=1$ we have $e_k=1$ in all rows of Table~\\ref{fluid_residuals_uniform_equal}. This case is addressed in~\\cite{wave_fluid_compars, PMP_wave_fluid_uniform, PMP_wave_fluid_H1_uniform, NT_wave_fluid_uniform} and can be considered as worst case. For $d_k=1$ and $ \\alpha=\\beta=1$ the a posteriori error estimator is very inaccurate, but its error function is still well defined. We see that the error estimator depends on $\\alpha$ and $\\beta$ and it is not defined for identical $\\alpha$ and $\\beta$ on the same grid. The error function has the form of $e"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method depends also on the solid subproblem. In \\cite{AR2008} a hybrid method using a finite element method for the solid state at each time step, and a spectral spectral element method (SA) for the fluid exchanges is proposed. The overall convergence rate in spaces $P$ for the fluid and $V$ for the solid is of order $\\sim$ $E$ for the flux at the boundaries for a total convergence rate of order $\\sim$ $E+P+V$. The numerical tests in \\cite{AR2008} demonstrate that the overall convergence rate depends mainly on the accuracy of the subproblem $E$. We propose a similar scheme in \\cite{AR2011b} using a finite element subproblem for the solid with a finite element exchange surface for the fluid. The overall convergence rate in spaces $P$ for the fluid and $V$ for the solid is presented in terms of the number of degrees of freedom in volumes $\\sim$ $E+P+V$. In \\cite{AR20"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% on the B3rd floor. In addition, the scanning frames do not seem to help on lower frequencies. On $100$ and $40$ MHz, the ECC is not reduced using a scanning frame, as shown in Fig. \\ref{fig:image17}. The windows of a normal interval might be too large and do not allow for a reliable correlation algorithm, or a data analysis algorithm, such as the particle swamp algorithm to run. Thus, a while loop is used to track the antennas and measure the missing ones, which is probably the reason for such a high EC on a few floors. Moreover, a normal interval might be updated too quickly by the antenna-antenna link level modeling in the global error matrix, i.e. GEC, where the same antenna can be connected to different antennas each time it is observed, such that a measured power level is associated to a different antenna, causing the antenna-antenna link power variation to be too large to be correlated with the antennas. Looking at the upper left diagram in Fig. \\ref{fig"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "p5468}, it was shown that continuous perturbations of the system gradually drag the streams into a spherical shape. The spherical shape corresponds to the host structure of the CME, which was a spherical region in the S/C \\citep[see][]{2015ApJ...802..150L}. Moreover, the presence of continuous small perturbations in the width of the CME-host as well as in its flight trajectory implies that the velocity profile of the S/C is affected. Thus, the equilibrium of the CME-host crossing the solar surface cannot exist in the presence of continuous small perturbations. Considering the imbalanced mass of the CME-host (the amount of the solar mass is relatively small), it is logical to expect that its trajectory will also change. Moreover, it was shown that the higher the angle of injection of the CME-host, the stronger the transformation and the more circular the final shape of the stream. All of this leads to the conclusion that the spherical equilibrium of the CME-host is not existing and that the observed phases of transformation are a natural search for the smallest"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2014,Lu2015,Gbur2016,Yao2018}) the study of the convergence of the optimal solution to the standard television minimization condition \\eqref{tvproblem} over the set of all Christianudov curves remains limited. \\cite{Kaltenbacher2014} mentions a justification of the cone condition under variation of the parameter \\emph{w.f.i.}, while \\cite{Lu2015} mentions a convergence analysis towards standard television minimization under the additional assumptions of regularizability and geometric stability. \\cite{Gbur2016} mentions a convergence analysis towards standard television minimization under the additional assumption that the parameter of Christianudov curves is well behaved at the singularities. \\cite{Yao2018} mentions a convergence analysis of the optimal solution to the standard television minimization condition \\eqref{tvproblem} under the additional assumption that the Cartesian version of the parameter well behaves at the singularities of the Christianudov curve. The study of the convergence of the optimal solution to the standard"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for $f$'s and $r$'s, and the effective efficiency of these efficient estimates is not obvious.  However, for the Landweber iteration, we observe that after a few iterations, the first threshold $h_k$ is estimated as around $r_k$, which is also the recovery rate at the end of these iterations, and this seems to be a natural estimate for $h$.  Thus, if the initial $h_0$ is a small value (i.e. \\textit{does not contain noise}), then the iteration converges to a multiple of that initial $h_0$, and therefore the iteration converges to a multiple of $k$, which implies that the iteration converges to a very small value.  Indeed, with sufficient iterations, the iteration converges to $k$ in the expected sense, even if $h_0$ contains noise.  However, if $h_0$ contains noise, then this method will require very few iterations for convergence, and so the minimum penalty will be approximately $\\lambda k$, which is not minimal, where a minimum penalty should"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the solution on a neighborhood of zero, which is independent of the robustness of the solution structure. The parameter vectors $w_{k+1}$ are obtained by applying a linear operator, i.e., the Kohn operator, to $w_{k}$ that close form expression is given in Appendix \\ref{proof_of_theorem_3}. The convergence of the parameter vectors $w_{k+1}$ is independent of the initial parameter vectors $w_0$ and the noise $L$ in the functional $f_{k}$ is a constant that is not included in the parameter vectors $w_{k}$. The convergence of the solution structure (optimal rate of convergence of the solution) is given in Theorem \\ref{theorem_solution_linear_case} or Theorem \\ref{theorem_solution_nonlinear_case} and is independent of $w_0$ and $L$. The conclusion of the theorem is independent of time iteration $k$ and the solution structure is appropriate for a linear-linear inference. The convergence rate of solution in the linear case is exactly the same as the"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the potential, which is sufficiently small for all $j\\in\\mathbb{N}$ such that the problem still possesses solutions. For that, we break down the model into two parts: (i) Suppose the condition $u = \\phi$ is satisfied for a collection $\\phi \\in \\mathcal{K}_\\Omega$, where $\\mathcal{K}_\\Omega \\subset \\mathcal{U} \\subset \\mathcal{H}$ are open subsets of $\\Omega$ with $u \\in \\mathcal{K}_\\Omega \\Rightarrow u \\in \\mathcal{U}$; (i.e., $u_\\phi \\onlywindow{\tin}{\\ only} \\onlywindow{\tout}{\\ exists}{\\ means} u(\\phi); (ii) There exists a solution $u \\onlywindow{\tin}{\\ only} \\onlywindow{\tout}{\\ exists}{\\ to} \\onlywindow{\tout}{\\ only}{\\ only} \\onlywindow{\tout}{\\ only}{\\ only} \\onlywindow{\tout}{\\ only} {particular system of PDEs}. The collection $\\mathcal{K}_\\Omega$ exhibits that the ocean"
        }
      ]
    },
    {
      "train_loss": 0.03158154296875,
      "update_counter": 7500,
      "epoch": 241.90322580645162,
      "val_loss": 6.919921875,
      "val_entropy": 0.58740234375,
      "val_ttr": 0.6028645833333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our configuration I and configuration II based on the mean curvature in the last $1000$ time steps of the simulation for the following relevant parameters: $\\tau =1000$, $L_v = 1$, $L_h =1$, $N_v = 100$, $N_h = 100$, $\\lambda = 0.0005$, $\\bar est = 0.01$, $D = 1$, $\\zeta = 0.2$, $\\mu = 0.2$, $\\alpha = 0.0001$ and $\\beta = 0.000000008$. For configuration I we have $\\nu = 0.8$ and $\\mu_s = 4.75$. Notwithstanding the presence of dynamics, in both figures the mean curves close to the neutral line converge to a stationary constant value. To better notice the behavior of the mean curves, in both figures we have also plotted the corresponding stationary value in the background. Rapid dynamic"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In^\\alpha, In^\\beta$ defined in~\\eqref{eq:inclusions} and $\\alpha=\\beta=1/2$ in $\\mathbb{T}^2$. In these problems the results of the fluid discretization are perfectly synchronized. The main feature here is that even though the gradients and velocities are synchronized, due to the different formulations of the a posteriori errors in~\\eqref{eq:ep1} and~\\eqref{eq:ep2}, the errors never coincide. As expected the hybrid method outperforms the fluid-only method, however, in certain parts of the domain the a posteriori error is too small to justify the proposed step size approach and we recover the error convergence in the order of Euler. As an addition to the a posteriori error estimates, we also track the number of degrees of freedom in the fluid simulations and the number of vertices in the tensor network in a sequence of problems where the diameter of the underlying graph grows eventually to $2d=21$. The fluid simulations are highly sensitive to this development and the error convergence is not valid in the case of Table~\\"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". In the fluid subproblem, a preferable choice has been found to use adaptive discontinuous galerkin method with implicit Euler method on hybrid timestepping, which yields efficient solution of highly nonlinear and large algebraic problems together. The solid continuation is expected to be happier with a derivative method, and we have used implicit Euler algorithm for it, also on hybrid timestepping. Both subproblems are launched from the central subroutine call into their subproblems using \\cite{PetersenStehlit} staged coupling method which allows for a coherent use of global working vectors and building of working matrices for both subproblems without the need for pre-solving the solid problem. From the subproblems, results are obtained and are handed over to the next subproblem through the same procedure. This is then repeated over the timestep, and control variables in each step are synchronized with those from the neighboring timestep by referring to the integrated fluid output and solid position. This is done on implicit basis, which allows to smooth out effect of dislocation-metal interaction which happens right at the interface between solid step and output reading. Overall, the"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% on the B3rd floor. In addition, the results indicate that scanning allows to reduce the motion estimation errors on the second and third floors, as shown in Fig. \\ref{fig:error_images}. For example, look at the average error in the images corresponding to 2/02/2017 at the B0 floor (Fig. \\ref{fig:error_images}, a), where the use of scanning improves the average motion estimation errors to around 10\\% (red lines), as opposed to 30\\% in the non-scanning frames (blue lines). The same effect can be observed on the other floors, where the use of scanning reduces the motion estimation errors to around 10\\%  (Fig. \\ref{fig:error_images}, b to d). Thus, not only does scanning, as shown above, reduces the component of cross-floor errors but it also allows to reduce the motion estimation errors on all floors. It is also worth noting that the use of scanning, even in the $2.4$ GHz"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "gfa}, it was shown that to describe the motion of a CME across the solar wind, it (the CME) should have a very small, but positive value for the solar wind velocity (along with the zero value for the electric field). This condition follows from performing a transformation into the solar wind frame. And this means that such a process must occur when the CME leaves the Sun and moves into the solar wind. However, as it was shown in \\cite{Shi:2015QoM}, a perfect transition of a particle from the Sun's magnetic field to the solar wind magnetic field occurs only at one point and then the particle moves in the solar wind magnetic field with a constant distance from the Sun's surface. But since the solar wind velocity is a random variable, the magnetic field in which the CME moves changes in a moment, and therefore the chance for the process to occur is zero. This fact was discussed in more detail in \\cite{Webb:2017pwC}. It was shown that under the conditions of the non-relativistic exchange of the"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "stein2019,Hochstenbach2020}), the upper bound condition \\eqref{boundcondition} was used in the study of \\cite{Crawford2019_RAM,Dey2018_stability}. As a consequence, the same setting will be used in this paper. Note, however, that we will present a result (Theorem \\ref{remarkregardingfinalave}) that is unique to the cone \\eqref{canonicalcone}. \\cite{Crawford2019_RAM,Dey2018_stability} also consider using the upper bound condition together with a generalization of the classical angular symmetry, called spherical symmetry, called paraxial symmetry. The relationship between the cone condition and the upper bound condition is not clear, but it seems plausible that the results of this paper also hold for the paraxial formulation, as discussed in \\cite{Crawford2019_RAM,Dey2018_stability}, but we have not proven this on our side. Instead, we will focus on the side of"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. The local gradient can be used as an efficient estimate for the total cost function. We have noticed that for some large initial conditions the Landweber method produces a spurious local minimum. The effective gradient along the current search path seems to come from the gradient of the central condition. This value seems to be independent of $x$. For the last two iterations for which we were present, we used this efficient gradient instead of the true gradient. The final result was significantly improved and we were able to produce the final results with this procedure. For the two last iterations, the algorithm was changed to use the effective gradient instead of the true gradient. Note that the convergence is the same, but with less performance. We discuss this with the author and he suggested the effective gradient method. With the results presented here, the order of efficiency of the performances of the different methods is $7$-th order accuracy for the original gradient method, $6$-th order for the gradient method with effective gradient method, $5$-th order for the iterative method with effective gradient method and $4$-th order for the iterative"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the solution on a neighborhood of the initial data, which can be empty. In the linear case, one can prove a global convergence of the solution. We would like to mention that the criterion used in \\cite{IVE2018} to determine the convergence consistency in the nonlinear case also implies the local convergence. The justification of the consistency of consistency and convergence consistency is in \\cite{AC2018}. The establishment of such a criterion is motivated by the fact that the initial data in most realistic cases may include\u566a\u58f0, and the noise eventually helps to equilibrate the components of the underlying vector field, which brings the component-wise mappings to a common conclusion. We should note that the establishment of the convergence theory in the nonlinear case required more arguments than in the linear case, e.g.,  rigorous proofs of assumptions \\ref{ac_linear_step2} and \\ref{ac_linear_step3} in Theorem \\ref{thm_linear_reg} require additional \\emph{equilibrium stability} assumptions in \\cite{AC2018}. Moreover,"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient of the potential, which is sufficiently small for all $j\\in\\mathbb{N}$ such that the problem still has reasonable conditions. Further, $\\delta\\in \\B$ is the diameter of the underlying obstacle $\\delta_\\Omega(\\zeta)\\coloneqq \\sup_{\\zeta\\in \\Omega}|\\zeta|$, and $ \\A \\in \\M$ denotes the size of the obstacle, that is $ \\A < \\delta\\B$. Note that a clear enough classification of the obstacles is not necessary since we assume that $ \\Omega_{\\star} $ is a subset of the image $\\image(\\phi_{\\nu})$ of a small perturbation of $\\Omega$ with respect to a small control noise \\(\\nu\\). The noise \\(\\nu\\) is a random variable characterizing the image of the source space $\\Omega$ under the map \\(\\phi_{\\nu}$. Thus, we can assume that the potential of $\\Omega$ is very simple, for example it is a simple cavity. However, in reality, the noise \\(\\nu$ generates a great variety of images \\(\\image(\\phi_{\\nu})\\) corresponding to different"
        }
      ]
    },
    {
      "train_loss": 0.06101708984375,
      "update_counter": 7750,
      "epoch": 249.96774193548387,
      "val_loss": 5.35546875,
      "val_entropy": 0.74853515625,
      "val_ttr": 0.5885416666666667,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison between our time-filter based configuration and the sliding window based configuration in tracking the same subwindows across frames in the evolution of {\\emph{PSP}} observations of a CME on 12 Jan. 2019. The white dashed curves in these figures correspond to the complete bounds of the subwindow. In both figures, subwindows are observed to start at time $t_s=\\pm1$ in Fig.~\\ref{comparison_whole_timeline_configuration_1} and $t_s=\\pm2$ in Fig.~\\ref{comparison_whole_timeline_configuration_2}. Since our tracking algorithm in Figure~\\ref{comparison_whole_timeline_configuration_1} uses the bounds from the bounding algorithm in Figure~\\ref{comparison_whole_timeline_bounding_configuration_2}, we apply the white dashed curves to help find the bounds of subwindows. Since the bounding algorithm in Figure~\\ref{comparison_whole_timeline_bounding_configuration_2} uses the subwindows bounded in the next time step, applying the bound from the bounding algorithm to find"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for the wave problem with square boundary conditions. We again choose grids with uniform spacing $h_i = h$ for all $i$. However, this time all volumes are equal and equal is also our convergence rates. As we expected, the estimator based on the triangular approximation performs best. Its accuracy seems to be independent of the wave height and close to the theoretical estimate~\\cite{Gol20}. The performance of the generalized additive model (GAM) is similar to the polynomial estimate for small wave heights, but it seems to underestimate the error in larger waves. One might expect that the GAM model could be improved by adding appropriate function spaces to its a priori estimation. For example, a spline model of order $q > 1$ would be sufficient to describe the error of dimension $d = 1$ in dimension $n = 2$ using eigenfunctions of the Stokes problem with $k = 1$. Table~\\ref{fluid_residuals_uniform_gamm} shows another model, where we consider the approximation using the Legendre polynomials of order $q = 4$. The accuracy of the"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will ultimately be determined by the solution of the Stokes system. A large variety of approaches to the solution of this system have been proposed in the past (see \\cite{Ashford,Gazzola,GazzolaPSP,GazzolaMPSP,GazzolaSimFlex,GazzolaSimFlexPF,GazzolaSimFlexT,GazzolaSimFlexTPF,GazzolaSimFlexTPSP,GazzolaSimFlexTPSP,Legrand,Marre,GazzolaPPP1,GazzolaPPP2,Shewman,Stephan,Valdimarsson,Hochstenbach,Delaune,Wang,Yang,Qin,Guo,Weng,Chen,Zhou,Theen) with each approach having some advantages and disadvantages. The majority of these approaches can be divided into three groups, implicit methods \\cite{Ashford,GazzolaMPSP,GazzolaPSP,GazzolaSimFlex}, semi-implicit methods \\cite{G"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to 2\\% in the $40$ cm (or lower) frames. Fig.~\\ref{fig:scan_cover_summary} summarizes the coverage control benefits of scanning over fixed scheduling for two different baseline solutions: (a) a fixed partition size for every frequency and (b) a fixed partition size every 20 ms time slot. The solution under the first strategy is shown in Fig.~\\ref{fig:scan_cover_summary}(a), while the solution under the second strategy is shown in Fig.~\\ref{fig:scan_cover_summary}(b). The solution under the $20$ ms time slot is shown for illustration purposes as it is not a valid scheduling strategy in the real RSU as it results in too much interference for a fixed number of subscribers. The scan strategy with a $50\\%$ partition size for every $200$ ms (or $400$ ms every $1000$ ms) every $20$ m gap between every two scan lines has coverage coverage control benefits for the $2.4"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "caa} it is shown that the finite concentration of particles in a comet's orbit leads to a non-vanishing total mass flow from the comet domain to the inner solar system domain. This means that the equilibrium version of the CME does not exist in general. Moreover, the direct connection between states of the two domains in the two-domain model (as expressed through the total constraint in Eq.~\\eqref{total_constraint}) is broken. This means that the dynamical rules in the two domains are not fully integrated, and that there is a partial disconnection of the domains. This means that in general the orbital evolution of a particle in the comet domain is not the same as the orbital evolution of a similar particle in the inner solar system domain. This means that also the velocity of both the seed comet and of the simulated post-explosion CME must be defined and resolved accurately and accurately integrated. This requirement is particularly important for low-element counts simulations where the dynamical structure of the embryo is important in deciding the final evolution of a CME. Moreover, as it was shown by \\citet{rudawska20"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher13,Credit2013,Brin2013,Schrittwieser14,Bisti15}, Albeverio1988,Shirov1997,Burke1998,Carrasquillo1999,Weiss05,Brin2002) the constraint level domain has not been explicitly analyzed. For example, a study of the level set method and its numerical discretization was presented in \\cite{Brandt03}. The authors consider a level set method for the discretization of the level set function and prove that the solution of the discrete model is well-posed if the initialization condition is satisfied. However, the authors assume that the function $\\psi \\in C^2$ and hence the tangential cone condition \\eqref{tangentialconecondition} holds. This means that the authors have a method to solve the initial value problem for the level set equation (provided $\\psi\\in C^2$) if the constraint level domain $\\psi>0$ does not hold."
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the gradient. In the second panel of figure \\ref{fig:estigv}, we present the gradient efficiency for the local solution found in the first run of the example in figure \\ref{fig:ocde_example}. The bottom panel of the figure also shows the error $e_{k}$ in use of the effective gradient. The effective gradient is the gradient of the finite domain problem, multiplied by $-4\\pi/L$. The corresponding solution quality is at around 1000, which is comparable to the best solutions in the final display in figure \\ref{fig:ocde_example}. However, the construction of this solution is noisy and inefficient. The first Landweber iteration has already jumped to the wrong solution direction. The second application of the iteration is almost pure noise, and its gradient estimate is estimated approximately to be zero. We note that the initial direction is related to the choice of the initial direction, and we do not recommend to use this initial direction in practice. A better choice would be the perpendicular direction to the surface defect. Despite of the inefficient efficiency of"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the approximate solution at the control point $k$, and the following control point $k+1$. We observe that \\eqref{eq:Jff} is a direct consequence of the well-posedness of the global linear case. We do not have this luxury in the nonlinear case. To overcome this, we establish the derivation of $f(y,u^k)$ and $f(y,u^{k+1})$ using the one-step dynamics by using either the projection onto the finite space $\\mathcal{P}_k$ or onto the subspace $\\mathcal{V}_k$. This allows us to derive the error $e^{k}_f$ and to prove the convergence of $f(y^k,u^k)$ and $f(y^{k+1},u^{k+1})$. We can then prove the convergence rate of $f(y^k,u^k)$ and that of $f(y^{k+1},u^{k+1})$ with the help of the local boundedness of $e^{k}_f$. However,"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the surface $\\partial\\Omega$ in a small neighborhood of the boundary, i.e., $||g_j -\\nu||_{\\infty} < \\varepsilon$ for all $j=1,2,\\dots$. We say that $\\Omega$ is a _deep_ region if there exists a small positive number $\\delta$ such that the potential of $\\mathcal{F}_\\delta(\\Omega)$ is not well-defined, i.e., $v(\\delta) > 0$. Note that deep regions are distinguished by a different kind of problems, e.g., either the adverse effect of shallow regions, or multi-scale problems, which we will consider in the next section. However, the role of deep regions in the appearance of nonclassical solutions is still very important, and they should be taken into account. To remove the effect of deep regions, we first regularize the boundary and then transform the problem to the subdomain. To do so, we introduce another smooth bounded domain $\\Omega^\\sharp \\subset \\R^N$ with $C^{\\infty}$ boundary, close to $\\Omega"
        }
      ]
    },
    {
      "train_loss": 0.037248046875,
      "update_counter": 8000,
      "epoch": 258.03225806451616,
      "val_loss": 6.283203125,
      "val_entropy": 0.64404296875,
      "val_ttr": 0.6009114583333334,
      "val_gen_seqs": [
        {
          "context": "\t    (1, 11) (2, 11) (3, 11) (4, 11) (5",
          "generated_continuation": ", 11) (6, 11) (7, 11) (8, 11) (9, 11) (10, 11) (11, 11) (12, 11) (13, 11) (14, 11) (15, 11) (16, 11) (17, 11) (18, 11) (19, 11) (20, 11) (21, 11) (22, 11) (23, 11) (24, 11) (25, 11) (26, 11) (27, 11) (28, 11) (29, 11) (30, 11) (31, 11) (32, 11) (33, 11"
        },
        {
          "context": "In Figures~\\ref{comparison_whole_timeline_configuration_1} and \\ref{comparison_whole_timeline_configuration_2} we show",
          "generated_continuation": " the comparison of the maximum likelihood trajectory of each individual on the time-evolution curve for the configuration $\\lambda = 1/2$ and $\\lambda = 3/2$ respectively. In the first figure, the dashed lines represent the evolution of the median of the distribution $\\langle\\sigma_j^{\\alpha}\\rangle$ for all the agents rather than just the most probable one. As can be seen, the trajectory of each individual is consistent with the median of the distribution. In the second figure (a), the dynamics follows a positive evolution initially with a sharp peak at $t =12$ and then goes down gradually. The level of the peak seems to be depending on the proportion of females $r$ for which we expect a positive evolution. In fact, when $r = 0.8$, the initial peak has a large value of about 1.1 as can be seen in figure \\ref{comparison_whole_timeline_configuration_2}a,b. However, when $r = 0.4$, the initial peak is very small about 0.6 which"
        },
        {
          "context": "In Table~\\ref{fluid_residuals_uniform_equal} we show results of the a posteriori error estimator on a sequence of uniform time meshes",
          "generated_continuation": " for $In_{\\alpha, \\beta}$ defined in Eq.~\\eqref{In_alpha_beta} with $\\alpha = 1$, $\\beta = 1/2$ in the case of fixed resolution and scaling. We set $k_1 = k_2 = 2^n$ and $\\lambda_1 = \\lambda_2 = 1/4$ in~\\eqref{scaling_parameters}. In these simulations, $ELP_i = 1$ for all $i$, $T_{i+1} / T_i = 2.0837 i$, and $\\left\\langle \\chi_i \\right\\rangle = 1.0331 \\approx i^2$ for all $i$. In contrast to results for the suborder $El = 4$ shown in Table~\\ref{fluid_residuals_uniform_,} where the estimator's predictions were inaccurate, in the case of larger a posteriori error $ELP = 9$ the residuals produce results that resemble those expected for a finite volume method, as"
        },
        {
          "context": "Featuring distinct time scales in each of the problems, the use of multirate time-stepping schemes with adapted step sizes for fluid and solid is obvious",
          "generated_continuation": ". The overall accuracy of the method will ultimately be determined by the simplest regime with the shortest convergence path, that is in this case assuming that the solid evolution is expected to behave slowly as well and to display a similar convergence behavior. Therefore, we consider a first order scheme for the fluid equation and we develop a high order scheme for the solid movement computation based on second order phases. In this way, we expect to obtain a first order integration scheme with respect to time and a second order scheme with respect to the motion interpolation. \\cite{Hochstenbach-etal-08} demonstrates in this context that higher order schemes for the solid movement computation do not gain any advantage, when the solid boundary conditions are not accurate enough. The overall accuracy will be determined by the simplest regime with the shortest convergence path, that is in this case assuming that the fluid evolution is expected to exhibit a similar convergence behavior. Therefore, we look for a high order scheme for the solid movement computation based on second order phases. We describe the solid with a mixed finite element procedure using linear solid elements for the solid boundary and higher order elements"
        },
        {
          "context": "Scanning frames not only reduce the localization errors on the same floor in $2.4$ GHz but, even floor errors start at as high as ",
          "generated_continuation": "10\\% down to about $5\\%$ in the first \\(20\\) frames \\((\\text{floor error rate: 8\\%})$$\\), compared with the baseline. Moreover, the scan frames improve the low level motion detection in high frequency bands $8$ and $9$ by $60\\%$ and $50\\%$, respectively in \\(2.4\\) GHz and \\(70\\%\\) and \\(40\\%\\) in \\(2.6\\) GHz and \\(8\\) GHz, compared with the baseline. Then, the number of detected motion instances on each floor increases from $100$ to $1600$ in $2.4$ GHz, and from $100$ to $450$ in $2.6$ GHz compared with the baseline. Moreover, by looking at the response value of the signal for each detector in the region, it is evident that the scan frames improve the signal-to-noise in all floors for higher frequencies, such as $9$ and $10$ on \\(2.4\\) GHz, $7"
        },
        {
          "context": "Nevertheless, the more accurate consideration demonstrates that the original equilibrium version of the CME does not exist. In \\cite{Valgushev:2015",
          "generated_continuation": "gfa}, it was shown that duration of magnetic rigidity destruction provided by a well-defined central region of a CME, remains relatively small, compared to the interplanetary travel time. As a result, the region behind the CME is charged by a strong solar wind speed spike, which can be very disruptive to the solar wind acceleration process. This is particularly important for energetic particles, which take time to reach the equilibrium distances beyond the HICS. In fact, such an process was observed in observations \\cite{2020ApJS..246...88K}, and may be a fundamental role in energetic particle acceleration. Moreover, it is likely that such slow solar wind acceleration may continue up to 1~AU, and even leading to a maximum maximum energy of 50~MeV \\citep{2020ApJS..246...88K,2020ApJS..246...62M}. Thus, some charge separation must occur near the Sun, powered by the solar wind speed spike, and facilitated by the region in which magnetic structure becomes"
        },
        {
          "context": "Even though the tangential cone condition \\eqref{tangentialconecondition} holds for a number of different applications (see e.g.~\\cite{Kalten",
          "generated_continuation": "bacher2011,Ge2017,Pan2017,Li2017,Yang2017,YangLi2018,Yang2018} in addition to the standard setting from contact mechanics~\\cite{Mor_iniski2010,Flinger2013,Zaidi2014,Thisisonmimics,Mahajan2016,Galster2016,Kanemori2017}), for a nonlinear function $f(x)$ the computational intractability of (\\ref{tangentialconecondition}) means that for most applications it is better to assume that the tangential friction force satisfies an ordinary linear friction law. For example, a recent experimental study of \\cite{Yang2017} shows that the tangential friction force follows an ordinary linear friction law when the body makes quick turns or when it slips. This is an example of when the computational intractability of condition (\\ref{tangentialconecondition}) is important. Importantly, we can rescale the nonlinear"
        },
        {
          "context": "    \\item \\emph{The spurious first local minimum for Landweber iteration.} Recall that the effective performance of heuristic rules depend also on efficent estimates",
          "generated_continuation": " for the eigenvalues of $A$. The Landweber iteration uses the recent two iterations $x_{n+1} $ and $x_{n+2} $ to estimate the smallest two eigenvalues of $A$. If we take any system $x(t)=x_{0}$ for $x_{0}\\in\\mathbb{R}^n$, it is not hard to find time evolution $x(t)$ that makes $x(t_k)$ converge to a local minimum. The temporal set $\\{x(t_k)\\}_{k=1}^N$ contains\u65e0\u6570 clustered points. When we estimate the smallest two eigenvalues of $A$ using these temporal set, the estimated eigenvalues can be any number close to zero. However, we are only interested in the uniform sampling over the temporal set. Therefore, we use a fast convergence multi-dimensional scaling (MDSC) \\cite{Mousavian2012} to universalize the temporal set. For an explicit configuration, the spurious first local minimum can be eliminated by using this MDSC with the Landweber iteration. For an implicit"
        },
        {
          "context": "    \\item \\emph{Only local convergence in nonlinear case.} The established convergence theory in the nonlinear case is a local one: one can only prove convergence",
          "generated_continuation": " of the local solution. In our validation case 2, we have verified that the derived kernel matrices $K_{ij}^l$ encode most of the information of the original data and can be used identically to $K_{ij}$. Thus, the derived solutions $\\widehat{H}_l(t_m)$ can be considered as valid estimators of $H(t_m)$. We prove that if the initial value $H(t_k)$ is close to the true value $H(t_{k+1})$, then subsequent iterations are almost constant and thus very close to $H(t_k)$. This illustrates the automatic convergence property and the minimum requirement for the initial value $H(t_k)$ is enough to converge to only one iteration. Of course, this does not guarantee that we can directly use $H(t_k)$ as initial value for the numerical simulation as the numerical iteration may not stop in one iteration. On the other hand, we have also proved that the convergence does not depend on the linear subspace $\\mathcal{V}$ for the numerical solution. We can therefore validate"
        },
        {
          "context": "where $\\Omega \\subset \\R^N$, $N=2,3$ is a bounded and smooth domain, and $g_j$ models",
          "generated_continuation": " the gradient for $j=1,\\ldots,n$ is given by $g_j(\\zeta) = \\sum_{k=1}^N a_{j,k}\\zeta_k$, $a_{j,k}\\in\\mathbb R$, with $a_{j,k}>-1$ to ensure that $g_j$ is directionally differentiable. It is to be noted here that no concavity assumptions are made on the gradient $g_j$. We further assume that $g_j \\geq -2$ for all $j$, so that $g_j$ is nonnegative and has a directionally differentially chain ring property of order $j+1$ (see \\cite{Dua} for details). Note that such a model is consistent with recent investigations into gradient dynamics in marangoni flow (see \\cite{kuehne2016modelling, Yang_2020, Boscarino_2021}). We assume that the boundary condition is given by $g_{j} = 0$ on the boundary"
        }
      ]
    }
  ]
}