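Form submission

4.10.22 Form submission

4.10.22.1 Introduction

This section is non-normative.

When a form is submitted, the data in the form is converted into the structure specified by the enctype, and then sent to the destination specified by the action using the given method.

For example, take the following form: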

<form action="/find.cgi" method=get>
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>

If the user types "cats" in the first field and "fur" in the second, and then hits the submit button, the user agent will load /find.cgi?t=cats&q=fur.
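
As a rough illustration, the query string in this example can be reproduced with `urlencode` from Python's standard `urllib.parse` module. This is only a sketch: the field names and values are taken from the form above, and `urlencode` is not guaranteed to match the application/x-www-form-urlencoded encoding algorithm of this specification in every corner case.

```python
from urllib.parse import urlencode

# Field names and values taken from the example form and user input above.
fields = [("t", "cats"), ("q", "fur")]

# With method=get, the encoded form data becomes the query of the action URL.
url = "/find.cgi?" + urlencode(fields)
print(url)  # /find.cgi?t=cats&q=fur
```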

On the other hand, consider this form:

<form action="/find.cgi" method=post enctype="multipart/form-data">
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>

Given the same user input, the result on submission is quite different: the user agent instead does an HTTP POST to the given URL, with an entity body that looks something like the following text:

------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="t"

cats
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="q"

fur
------kYFrd4jNJEgCervE--
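
The body shown above can be approximated with a few lines of Python. This is a minimal sketch that handles plain text fields only (no files, no character-encoding handling); the helper name `multipart_body` is hypothetical. Note that each delimiter line is the boundary string prefixed by two hyphens, so the boundary assumed here is `----kYFrd4jNJEgCervE`.

```python
def multipart_body(fields, boundary):
    """Serialize simple text fields as a multipart/form-data body (sketch)."""
    lines = []
    for name, value in fields:
        lines.append("--" + boundary)  # delimiter line: "--" + boundary
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")               # blank line separates headers from value
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing delimiter
    return "\r\n".join(lines) + "\r\n"


body = multipart_body([("t", "cats"), ("q", "fur")], "----kYFrd4jNJEgCervE")
print(body)
```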

4.10.22.2 Implicit submission

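A form element's default button is the first submit button in tree order whose form owner is that form element.

If the user agent supports letting the user submit a form implicitly (for example, on some platforms hitting the "enter" key while a text field is focused implicitly submits the form), then doing so for a form whose default button has a defined activation behavior must cause the user agent to run synthetic click activation steps on that default button.

Consequently, if the default button is disabled, the form is not submitted when such an implicit submission mechanism is used. (A button has no activation behavior when disabled.)

There are pages on the Web that are only usable if there is a way to implicitly submit forms, so user agents are strongly encouraged to support this.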

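If the form has no submit button, then the implicit submission mechanism must do nothing if the form has more than one field that blocks implicit submission, and must submit the form element from the form element itself otherwise.

For the purpose of the previous paragraph, an element is a field that blocks implicit submission of a form element if it is an input element whose form owner is that form element and whose type attribute is in one of the following states: Text, Search, URL, Telephone, E-mail, Password, Date, Time, Number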
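
The decision described in this section can be sketched as follows. The `Control` class and the returned strings are hypothetical simplifications, not part of any real API. The sketch assumes that every listed control has the form as its form owner, that a form without a submit button is left alone only when it has more than one blocking field, and that the blocking states correspond to the type keywords listed in `BLOCKING_TYPES`.

```python
from dataclasses import dataclass

# type attribute keywords assumed to map to the blocking states listed above
BLOCKING_TYPES = {"text", "search", "url", "tel", "email",
                  "password", "date", "time", "number"}


@dataclass
class Control:
    tag: str               # "input" or "button"
    type: str              # value of the type attribute
    disabled: bool = False


def is_submit_button(c):
    if c.tag == "input":
        return c.type in ("submit", "image")
    return c.tag == "button" and c.type == "submit"


def implicit_submission(controls):
    """Return what implicit submission does for a form's controls (tree order)."""
    default_button = next((c for c in controls if is_submit_button(c)), None)
    if default_button is not None:
        # A disabled button has no activation behavior, so nothing happens.
        return "nothing" if default_button.disabled else "click default button"
    blocking = sum(1 for c in controls
                   if c.tag == "input" and c.type in BLOCKING_TYPES)
    return "nothing" if blocking > 1 else "submit form"
```

For instance, a form with a single text field and no submit button is submitted, while a form with two text fields and no submit button is not.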

4.10.22.3 Form submission algorithm

When a form element form is submitted from an element submitter (typically a button), optionally with a submitted from submit() method flag set, the user agent must run the following steps:

  1. Let form document be the form's Document.

  2. If form document has no associated browsing context or its active sandboxing flag set has its sandboxed forms browsing context flag set, then abort these steps without doing anything.

  3. Let form browsing context be the browsing context of form document.

  4. If the submitted from submit() method flag is not set, and the submitter element's no-validate state is false, then interactively validate the constraints of form and examine the result: if the result is negative (the constraint validation concluded that there were invalid fields and probably informed the user of this) then fire a simple event named invalid at the form element and then abort these steps.

  5. If the submitted from submit() method flag is not set, then fire a simple event that bubbles and is cancelable named submit, at form. If the event's default action is prevented (i.e. if the event is canceled) then abort these steps. Otherwise, continue (effectively the default action is to perform the submission).

  6. Let form data set be the result of constructing the form data set for form in the context of submitter.

  7. Let action be the submitter element's action.

  8. If action is the empty string, let action be the document's address of the form document.

  9. Resolve the URL action, relative to the submitter element. If this fails, abort these steps.

  10. Let action be the resulting absolute URL.

  11. Let action components be the resulting parsed URL.

  12. Let scheme be the scheme of the resulting parsed URL.

  13. Let enctype be the submitter element's enctype.

  14. Let method be the submitter element's method.

  15. Let target be the submitter element's target.

  16. If the user indicated a specific browsing context to use when submitting the form, then let target browsing context be that browsing context. Otherwise, apply the rules for choosing a browsing context given a browsing context name using target as the name and form browsing context as the context in which the algorithm is executed, and let target browsing context be the resulting browsing context.

  17. If target browsing context was created in the previous step, or, alternatively, if the form document has not yet completely loaded and the submitted from submit() method flag is set, then let replace be true. Otherwise, let it be false.

  18. Select the appropriate row in the table below based on the value of scheme as given by the first cell of each row. Then, select the appropriate cell on that row based on the value of method as given in the first cell of each column. Then, jump to the steps named in that cell and defined below the table.

                GET                  POST
    http        Mutate action URL    Submit as entity body
    https       Mutate action URL    Submit as entity body
    ftp         Get action URL       Get action URL
    javascript  Get action URL       Get action URL
    data        Get action URL       Post to data:
    mailto      Mail with headers    Mail as body

    If scheme is not one of those listed in this table, then the behavior is not defined by this specification. User agents should, in the absence of another specification defining this, act in a manner analogous to that defined in this specification for similar schemes.

    Each form element has a planned navigation, which is either null or a task; when the form is first created, its planned navigation must be set to null. In the behaviours described below, when the user agent is required to plan to navigate to a particular resource destination, it must run the following steps:

    1. If the form has a non-null planned navigation, remove it from its task queue.

    2. Let the form's planned navigation be a new task that consists of running the following steps:

      1. Let the form's planned navigation be null.

      2. Navigate target browsing context to the particular resource destination. If replace is true, then target browsing context must be navigated with replacement enabled.

      For the purposes of this task, target browsing context and replace are the variables that were set up when the overall form submission algorithm was run, with their values as they stood when this planned navigation was queued.

    3. Queue the task that is the form's new planned navigation.

      The task source for this task is the DOM manipulation task source.

    The behaviors are as follows:

    Mutate action URL

    Let query be the result of encoding the form data set using the application/x-www-form-urlencoded encoding algorithm, interpreted as a US-ASCII string.

    Set action components's query component to query.

    Let destination be a new URL formed by applying the URL serializer algorithm to action components.

    Plan to navigate to destination.

    Submit as entity body

    Let entity body be the result of encoding the form data set using the appropriate form encoding algorithm.

    Let MIME type be determined as follows:

    If enctype is application/x-www-form-urlencoded
    Let MIME type be "application/x-www-form-urlencoded".
    If enctype is multipart/form-data
    Let MIME type be the concatenation of the string "multipart/form-data;", a U+0020 SPACE character, the string "boundary=", and the multipart/form-data boundary string generated by the multipart/form-data encoding algorithm.
    If enctype is text/plain
    Let MIME type be "text/plain".

    Plan to navigate to action using the HTTP method given by method and with entity body as the entity body, of type MIME type.

    Get action URL

    Plan to navigate to action.

    The form data set is discarded.

    Post to data:

    Let data be the result of encoding the form data set using the appropriate form encoding algorithm.

    If action contains the string "%%%%" (four U+0025 PERCENT SIGN characters), then percent encode all bytes in data that, if interpreted as US-ASCII, are not characters in the URL default encode set, and then, treating the result as a US-ASCII string, UTF-8 percent encode all the U+0025 PERCENT SIGN characters in the resulting string and replace the first occurrence of "%%%%" in action with the resulting doubly-escaped string. [URL]

    Otherwise, if action contains the string "%%" (two U+0025 PERCENT SIGN characters in a row, but not four), then UTF-8 percent encode all characters in data that, if interpreted as US-ASCII, are not characters in the URL default encode set, and then, treating the result as a US-ASCII string, replace the first occurrence of "%%" in action with the resulting escaped string. [URL]

    Plan to navigate to the potentially modified action (which will be a data: URL).

    Mail with headers

    Let headers be the result of encoding the form data set using the application/x-www-form-urlencoded encoding algorithm, interpreted as a US-ASCII string.

    Replace occurrences of "+" (U+002B) characters in headers with the string "%20".

    Let destination consist of all the characters from the first character in action to the character immediately before the first "?" (U+003F) character, if any, or the end of the string if there are none.

    Append a single "?" (U+003F) character to destination.

    Append headers to destination.

    Plan to navigate to destination.

    Mail as body

    Let body be the result of encoding the form data set using the appropriate form encoding algorithm and then percent encoding all the bytes in the resulting byte string that, when interpreted as US-ASCII, are not characters in the URL default encode set. [URL]

    Let destination have the same value as action.

    If destination does not contain a "?" (U+003F) character, append a single "?" (U+003F) character to destination. Otherwise, append a single U+0026 AMPERSAND character (&).

    Append the string "body=" to destination.

    Append body, interpreted as a US-ASCII string, to destination.

    Plan to navigate to destination.

    The appropriate form encoding algorithm is determined as follows:

    If enctype is application/x-www-form-urlencoded
    Use the application/x-www-form-urlencoded encoding algorithm.
    If enctype is multipart/form-data
    Use the multipart/form-data encoding algorithm.
    If enctype is text/plain
    Use the text/plain encoding algorithm.
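
Several of the behaviors above ("Mutate action URL", the MIME type selection in "Submit as entity body", "Mail with headers", and "Mail as body") can be sketched in Python. All function names are hypothetical. `urlencode` stands in for the application/x-www-form-urlencoded encoding algorithm, and `ENCODE_SET` is an assumption about the contents of the URL default encode set (C0 controls, space, `"`, `#`, `<`, `>`, `?`, a backtick, and bytes above 0x7E); check the URL specification before relying on it.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Assumed contents of the "URL default encode set" (see lead-in).
ENCODE_SET = (set(range(0x00, 0x21)) | {0x22, 0x23, 0x3C, 0x3E, 0x3F, 0x60}
              | set(range(0x7F, 0x100)))


def percent_encode(data: bytes) -> str:
    """Percent encode every byte that falls in ENCODE_SET."""
    return "".join("%{:02X}".format(b) if b in ENCODE_SET else chr(b)
                   for b in data)


def mutate_action_url(action, entries):
    """Replace the query component of action with the encoded form data."""
    parts = urlsplit(action)
    return urlunsplit(parts._replace(query=urlencode(entries)))


def submission_mime_type(enctype, boundary=""):
    """MIME type used by "Submit as entity body"."""
    if enctype == "multipart/form-data":
        return "multipart/form-data;" + " " + "boundary=" + boundary
    if enctype in ("application/x-www-form-urlencoded", "text/plain"):
        return enctype
    raise ValueError("unsupported enctype: " + enctype)


def mail_with_headers(action, entries):
    """Build the destination for the "Mail with headers" behavior."""
    headers = urlencode(entries).replace("+", "%20")
    destination = action.split("?", 1)[0]  # drop any existing query
    return destination + "?" + headers


def mail_as_body(action, encoded_form_data: bytes):
    """Build the destination for the "Mail as body" behavior."""
    separator = "&" if "?" in action else "?"
    return action + separator + "body=" + percent_encode(encoded_form_data)
```

For example, `mail_with_headers("mailto:a@example.com?subject=old", [("subject", "Hello world")])` drops the old query and yields `mailto:a@example.com?subject=Hello%20world`.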

4.10.22.4 Constructing the form data set

The algorithm to construct the form data set for a form form, optionally in the context of a submitter submitter, is as follows. If not specified otherwise, submitter is null.

  1. Let controls be a list of all the submittable elements whose form owner is form, in tree order.

  2. Let the form data set be a list of name-value-type tuples, initially empty.

  3. Loop: For each element field in controls, in tree order, run the following substeps:

    1. If any of the following conditions are met, then skip these substeps for this element:

      • The field element has a datalist element ancestor.
      • The field element is disabled.
      • The field element is a button but it is not submitter.
      • The field element is an input element whose type attribute is in the Checkbox state and whose checkedness is false.
      • The field element is an input element whose type attribute is in the Radio Button state and whose checkedness is false.
      • The field element is not an input element whose type attribute is in the Image Button state, and either the field element does not have a name attribute specified, or its name attribute's value is the empty string.
      • The field element is an object element that is not using a plugin.

      Otherwise, process field as follows:

    2. Let type be the value of the type IDL attribute of field.

    3. If the field element is an input element whose type attribute is in the Image Button state, then run these further nested substeps:

      1. If the field element has a name attribute specified and its value is not the empty string, let name be that value followed by a single "." (U+002E) character. Otherwise, let name be the empty string.

      2. Let namex be the string consisting of the concatenation of name and a single U+0078 LATIN SMALL LETTER X character (x).

      3. Let namey be the string consisting of the concatenation of name and a single U+0079 LATIN SMALL LETTER Y character (y).

      4. The field element is submitter, and before this algorithm was invoked the user indicated a coordinate. Let x be the x-component of the coordinate selected by the user, and let y be the y-component of the coordinate selected by the user.

      5. Append an entry to the form data set with the name namex, the value x, and the type type.

      6. Append an entry to the form data set with the name namey and the value y, and the type type.

      7. Skip the remaining substeps for this element: if there are any more elements in controls, return to the top of the loop step, otherwise, jump to the end step below.

    4. Let name be the value of the field element's name attribute.

    5. If the field element is a select element, then for each option element in the select element's list of options whose selectedness is true and that is not disabled, append an entry to the form data set with the name as the name, the value of the option element as the value, and type as the type.

    6. Otherwise, if the field element is an input element whose type attribute is in the Checkbox state or the Radio Button state, then run these further nested substeps:

      1. If the field element has a value attribute specified, then let value be the value of that attribute; otherwise, let value be the string "on".

      2. Append an entry to the form data set with name as the name, value as the value, and type as the type.

    7. Otherwise, if the field element is an input element whose type attribute is in the File Upload state, then for each file selected in the input element, append an entry to the form data set with the name as the name, the file (consisting of the name, the type, and the body) as the value, and type as the type. If there are no selected files, then append an entry to the form data set with the name as the name, the empty string as the value, and application/octet-stream as the type.

    8. Otherwise, if the field element is an object element: try to obtain a form submission value from the plugin, and if that is successful, append an entry to the form data set with name as the name, the returned form submission value as the value, and the string "object" as the type.

    9. Otherwise, append an entry to the form data set with name as the name, the value of the field element as the value, and type as the type.

    10. If the element has a dirname attribute, and that attribute's value is not the empty string, then run these substeps:

      1. Let dirname be the value of the element's dirname attribute.

      2. Let dir be the string "ltr" if the directionality of the element is 'ltr', and "rtl" otherwise (i.e. when the directionality of the element is 'rtl').

      3. Append an entry to the form data set with dirname as the name, dir as the value, and the string "direction" as the type.

      An element can only have a dirname attribute if it is a textarea element or an input element whose type attribute is in either the Text state or the Search state.

  4. End: For the name of each entry in the form data set, and for the value of each entry in the form data set whose type is not "file" or "textarea", replace every occurrence of a "CR" (U+000D) character not followed by a "LF" (U+000A) character, and every occurrence of a "LF" (U+000A) character not preceded by a "CR" (U+000D) character, by a two-character string consisting of a "CR" (U+000D) "LF" (U+000A) character pair.

    In the case of the value of textarea elements, this newline normalization is already performed during the conversion of the control's raw value into the control's value (which also performs any necessary line wrapping). In the case of input elements whose type attributes are in the File Upload state, the value is not normalized.

  5. Return the form data set.
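
A much-simplified sketch of this algorithm is shown below. The `Field` class is a hypothetical stand-in for a submittable element. The sketch deliberately leaves out select elements, file uploads, image button coordinates, object elements, datalist ancestors, and dirname; it covers only the skip conditions for disabled controls, buttons, unchecked checkboxes and radio buttons, and unnamed controls, plus the final newline normalization.

```python
import re
from dataclasses import dataclass
from typing import Optional

BUTTON_TYPES = {"submit", "image", "reset", "button"}
_NEWLINE = re.compile(r"\r\n|\r|\n")


@dataclass
class Field:
    tag: str                      # "input", "button", "textarea", ...
    type: str                     # value of the type IDL attribute
    name: str = ""
    value: Optional[str] = None   # None means "no value attribute specified"
    checked: bool = False
    disabled: bool = False


def normalize_newlines(text):
    """Turn every lone CR or lone LF into a CRLF pair; keep existing CRLF."""
    return _NEWLINE.sub("\r\n", text)


def construct_form_data_set(controls, submitter=None):
    data_set = []
    for field in controls:  # controls are assumed to be in tree order
        is_button = field.tag == "button" or (
            field.tag == "input" and field.type in BUTTON_TYPES)
        is_toggle = field.tag == "input" and field.type in ("checkbox", "radio")
        if field.disabled:
            continue
        if is_button and field is not submitter:
            continue
        if is_toggle and not field.checked:
            continue
        if not field.name:
            continue  # image buttons without a name are not handled here
        if is_toggle:
            value = field.value if field.value is not None else "on"
        else:
            value = field.value or ""
        data_set.append((field.name, value, field.type))
    # "End" step: normalize newlines in names, and in non-file, non-textarea values.
    return [(normalize_newlines(n),
             v if t in ("file", "textarea") else normalize_newlines(v),
             t)
            for n, v, t in data_set]
```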

4.10.22.5 Selecting a form submission encoding

If the user agent is to pick an encoding for a form, optionally with an allow non-ASCII-compatible encodings flag set, it must run the following substeps:

  1. Let input be the value of the form element's accept-charset attribute.

  2. Let candidate encoding labels be the result of splitting input on spaces.

  3. Let candidate encodings be an empty list of character encodings.

  4. For each token in candidate encoding labels in turn (in the order in which they were found in input), get an encoding for the token and, if this does not result in failure, append the encoding to candidate encodings.

  5. If the allow non-ASCII-compatible encodings flag is not set, remove any encodings that are not ASCII-compatible character encodings from candidate encodings.

  6. If candidate encodings is empty, return UTF-8 and abort these steps.

  7. Each character encoding in candidate encodings can represent a finite number of characters. (For example, UTF-8 can represent all 1.1 million or so Unicode code points, while Windows-1252 can only represent 256.)

    For each encoding in candidate encodings, determine how many of the characters in the names and values of the entries in the form data set the encoding can represent (without ignoring duplicates). Let max be the highest such count. (For UTF-8, max would equal the number of characters in the names and values of the entries in the form data set.)

    Return the first encoding in candidate encodings that can encode max characters in the names and values of the entries in the form data set.
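
The selection can be sketched in Python under some loud assumptions: Python's `codecs.lookup` is used in place of the "get an encoding" label lookup (the two label tables are not identical), the ASCII-compatibility filter of step 5 is omitted, and the returned names are Python codec names rather than the names this specification would use. The function name `pick_encoding` is hypothetical.

```python
import codecs


def pick_encoding(accept_charset, form_data_set):
    """Pick a codec name for (name, value) pairs from an accept-charset value."""
    candidates = []
    for label in accept_charset.split():      # split input on spaces
        try:
            candidates.append(codecs.lookup(label).name)
        except LookupError:
            continue                           # unknown label: skip it
    if not candidates:
        return "utf-8"

    text = "".join(name + value for name, value in form_data_set)

    def representable(encoding):
        # Count characters (duplicates included) that the encoding can express.
        count = 0
        for ch in text:
            try:
                ch.encode(encoding)
                count += 1
            except (UnicodeError, LookupError):
                pass
        return count

    counts = [representable(enc) for enc in candidates]
    # list.index returns the first candidate reaching the maximum count.
    return candidates[counts.index(max(counts))]
```

With `accept_charset` set to `"windows-1252 utf-8"`, a value such as "café" keeps the first candidate, while a value containing Japanese text falls through to UTF-8 because only UTF-8 can represent every character.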

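4.10.22.6 URL-encoded form data

This form data set encoding is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversion between character encodings and byte sequences.

The application/x-www-form-urlencoded encoding algorithm is as follows: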

  1. Let result be the empty string.

  2. If the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form.

    Otherwise, if the form element has no accept-charset attribute, but the document's character encoding is an ASCII-compatible character encoding, then that is the selected character encoding.

    Otherwise, let the selected character encoding be UTF-8.

  3. Let charset be the name of the selected character encoding.

  4. For each entry in the form data set, perform these substeps:

    1. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

    2. If the entry's type is "file", replace its value with the file's name only.

    3. For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a "#" (U+0023) character, one or more ASCII digits representing the Unicode code point of the character in base ten, and finally a ";" (U+003B) character.

    4. Encode the entry's name and value using the encoder for the selected character encoding. The entry's name and value are now byte strings.

    5. For each byte in the entry's name and value, apply the appropriate subsubsteps from the following list:

      If the byte is 0x20 (U+0020 SPACE if interpreted as ASCII)
      Replace the byte with a single 0x2B byte ("+" (U+002B) character if interpreted as ASCII).
      If the byte is in the range 0x2A, 0x2D, 0x2E, 0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A
      Leave the byte as is.

      Otherwise
      1. Let s be a string consisting of a U+0025 PERCENT SIGN character (%) followed by uppercase ASCII hex digits representing the hexadecimal value of the byte in question (zero-padded if necessary).

      2. Encode the string s as US-ASCII, so that it is now a byte string.

      3. Replace the byte in question in the name or value being processed by the bytes in s, preserving their relative order.

    6. Interpret the entry's name and value as Unicode strings encoded in US-ASCII. (All of the bytes in the string will be in the range 0x00 to 0x7F; the high bit will be zero throughout.) The entry's name and value are now Unicode strings again.

    7. If the entry's name is "isindex", its type is "text", and this is the first entry in the form data set, then append the value to result and skip the rest of the substeps for this entry, moving on to the next entry, if any, or the next step in the overall algorithm otherwise.

    8. If this is not the first entry, append a single U+0026 AMPERSAND character (&) to result.

    9. Append the entry's name to result.

    10. Append a single "=" (U+003D) character to result.

    11. Append the entry's value to result.

  5. Encode result as US-ASCII and return the resulting byte stream.
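
The core of the encoding steps can be sketched as follows. The function name `urlencode_form` is hypothetical, entries are assumed to be `(name, value, type)` tuples of strings, and the encoding is passed in directly rather than selected from accept-charset. File entries and the isindex special case are left out, and the result is returned as a string rather than a US-ASCII byte stream. Python's `xmlcharrefreplace` error handler is used for the decimal character reference replacement of step 3.

```python
# Bytes left as is: "*", "-", ".", "_", ASCII digits and ASCII letters.
SAFE_BYTES = ({0x2A, 0x2D, 0x2E, 0x5F} | set(range(0x30, 0x3A))
              | set(range(0x41, 0x5B)) | set(range(0x61, 0x7B)))


def _encode_component(text, encoding):
    # Unencodable characters become "&#N;" before the byte-level escaping.
    data = text.encode(encoding, errors="xmlcharrefreplace")
    out = []
    for byte in data:
        if byte == 0x20:
            out.append("+")
        elif byte in SAFE_BYTES:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))  # uppercase hex, zero-padded
    return "".join(out)


def urlencode_form(entries, encoding="utf-8"):
    parts = []
    for name, value, type_ in entries:
        if name == "_charset_" and type_ == "hidden":
            value = encoding  # report the selected encoding to the server
        parts.append(_encode_component(name, encoding) + "="
                     + _encode_component(value, encoding))
    return "&".join(parts)


print(urlencode_form([("t", "cats", "text"), ("q", "fur", "search")]))
# t=cats&q=fur
```

Note how a character that the selected encoding cannot express is escaped twice: first as a character reference, then byte by byte, so "&", "#" and ";" themselves end up percent-encoded.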

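To decode application/x-www-form-urlencoded payloads, the following algorithm should be used. This algorithm uses as inputs the payload itself, payload, consisting of a Unicode string using only characters in the range U+0000 to U+007F; a default character encoding encoding; and optionally an isindex flag indicating that the payload is to be processed as if it had been generated for a form containing an isindex control. The output of this algorithm is a sorted list of name-value pairs. If the isindex flag is set and the first control really was an isindex control, then the first name-value pair will have as its name the empty string.

Which default character encoding to use can only be determined on a case-by-case basis, but generally the best character encoding to use as a default is the one that was used to encode the page on which the form used to create the payload was itself found. In the absence of a better default, UTF-8 is suggested.

The isindex flag is for legacy use only. Forms in conforming HTML documents will not generate payloads that need to be decoded with this flag set.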

  1. Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).

  2. If the isindex flag is set and the first string in strings does not contain a "=" (U+003D) character, insert a "=" (U+003D) character at the start of the first string in strings.

  3. Let pairs be an empty list of name-value pairs.

  4. For each string string in strings, run these substeps:

    1. If string contains a "=" (U+003D) character, then let name be the substring of string from the start of string up to but excluding its first "=" (U+003D) character, and let value be the substring from the first character, if any, after the first "=" (U+003D) character up to the end of string. If the first "=" (U+003D) character is the first character, then name will be the empty string. If it is the last character, then value will be the empty string.

      Otherwise, string contains no "=" (U+003D) characters. Let name have the value of string and let value be the empty string.

    2. Replace any "+" (U+002B) characters in name and value with U+0020 SPACE characters.

    3. Replace any escape in name and value with the character represented by the escape. This replacement must not be recursive.

      An escape is a "%" (U+0025) character followed by two ASCII hex digits.

      The character represented by an escape is the Unicode character whose code point is equal to the value of the two characters after the "%" (U+0025) character, interpreted as a hexadecimal number (in the range 0..255).

      So for instance the string "A%2BC" would become "A+C". Similarly, the string "100%25AA%21" becomes the string "100%AA!".

    4. Convert the name and value strings to their byte representation in ISO-8859-1 (i.e. convert the Unicode string to a byte string, mapping code points to byte values directly).

    5. Add a pair consisting of name and value to pairs.

  5. If any of the name-value pairs in pairs have a name component consisting of the string "_charset_" encoded in US-ASCII, and the value component of the first such pair, when decoded as US-ASCII, is the name of a supported character encoding, then let encoding be that character encoding (replacing the default passed to the algorithm).

  6. Convert the name and value components of each name-value pair in pairs to Unicode by interpreting the bytes according to the encoding encoding.

  7. Return pairs.
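
The decoding steps above map fairly directly onto Python. The sketch below uses a hypothetical function name, `decode_urlencoded`, and relies on `codecs.lookup` to decide whether a `_charset_` value names a supported encoding, which is an assumption about what "supported" means. Bytes that are invalid in the chosen encoding are replaced rather than rejected, which is also a choice made for this sketch.

```python
import codecs
import re

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def _unescape(text):
    text = text.replace("+", " ")
    # re.sub scans once, left to right, so the replacement is not recursive.
    text = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    return text.encode("latin-1")  # map code points 0..255 straight to bytes


def decode_urlencoded(payload, encoding="utf-8", isindex=False):
    strings = payload.split("&")
    if isindex and "=" not in strings[0]:
        strings[0] = "=" + strings[0]
    pairs = []
    for string in strings:
        # partition yields an empty value when there is no "=" at all
        name, _, value = string.partition("=")
        pairs.append((_unescape(name), _unescape(value)))
    for name, value in pairs:
        if name == b"_charset_":  # only the first such pair is considered
            try:
                encoding = codecs.lookup(value.decode("ascii")).name
            except (LookupError, UnicodeDecodeError):
                pass
            break
    return [(n.decode(encoding, errors="replace"),
             v.decode(encoding, errors="replace")) for n, v in pairs]
```

For example, `decode_urlencoded("q=a+b%26c&x")` gives `[("q", "a b&c"), ("x", "")]`.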

Parameters on the application/x-www-form-urlencoded MIME type are ignored. In particular, this MIME type does not support the charset parameter.

4.10.22.7 Multipart form data

The multipart/form-data encoding algorithm is as follows:

  1. Let result be the empty string.

  2. If the algorithm was invoked with an explicit character encoding, let the selected character encoding be that encoding. (This algorithm is used by other specifications, which provide an explicit character encoding to avoid the dependency on the form element described in the next paragraph.)

    Otherwise, if the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form.

    Otherwise, if the form element has no accept-charset attribute, but the document's character encoding is an ASCII-compatible character encoding, then that is the selected character encoding.

    Otherwise, let the selected character encoding be UTF-8.

  3. Let charset be the name of the selected character encoding.

  4. For each entry in the form data set, perform these substeps:

    1. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

    2. For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a "#" (U+0023) character, one or more ASCII digits representing the Unicode code point of the character in base ten, and finally a ";" (U+003B) character.

  5. Encode the (now mutated) form data set using the rules described by RFC 2388, Returning Values from Forms: multipart/form-data, and return the resulting byte stream. [RFC2388]

    Each entry in the form data set is a field, the name of the entry is the field name and the value of the entry is the field value.

    The order of parts must be the same as the order of fields in the form data set. Multiple entries with the same name must be treated as distinct fields.

    In particular, this means that multiple files submitted as part of a single <input type=file multiple> element will result in each file having its own field; the "sets of files" feature ("multipart/mixed") of RFC 2388 is not used.

    The parts of the generated multipart/form-data resource that correspond to non-file fields must not have a Content-Type header specified. Their names and values must be encoded using the character encoding selected above (field names in particular do not get converted to a 7-bit safe encoding as suggested in RFC 2388).

    File names included in the generated multipart/form-data resource (as part of file fields) must use the character encoding selected above, though the precise name may be approximated if necessary (e.g. newlines could be removed from file names, quotes could be changed to "%22", and characters not expressible in the selected character encoding could be replaced by other characters). User agents must not use the RFC 2231 encoding suggested by RFC 2388.

    The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used to generate the MIME type of the form submission payload generated by this algorithm.)

For details on how to interpret multipart/form-data payloads, see RFC 2388. [RFC2388]

4.10.22.8 Plain text form data

The text/plain encoding algorithm is as follows:

  1. Let result be the empty string.

  2. If the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form, with the allow non-ASCII-compatible encodings flag unset.

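    Otherwise, if the form element has no accept-charset attribute, but the document's character encoding is an ASCII-compatible character encoding, then that is the selected character encoding.

    Otherwise, let the selected character encoding be UTF-8.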

  3. Let charset be the name of the selected character encoding.

  4. For each entry in the form data set, perform these substeps:

    1. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

    2. If the entry's type is "file", replace its value with the file's name only.

    3. Append the entry's name to result.

    4. Append a single "=" (U+003D) character to result.

    5. Append the entry's value to result.

    6. Append a "CR" (U+000D) "LF" (U+000A) character pair to result.

  5. Encode result using the encoder for the selected character encoding and return the resulting byte stream.
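
A minimal sketch of the text/plain encoding follows. The function name `encode_text_plain` is hypothetical, entries are assumed to be `(name, value, type)` tuples of strings (for file entries the value is assumed to already be the file's name), and the encoding is passed in rather than selected from accept-charset. Characters that the encoding cannot express are not handled and would raise an error.

```python
def encode_text_plain(entries, encoding="utf-8"):
    """Encode (name, value, type) entries as a text/plain form payload."""
    result = ""
    for name, value, type_ in entries:
        if name == "_charset_" and type_ == "hidden":
            value = encoding  # report the selected encoding
        # One "name=value" line per entry, terminated by CRLF.
        result += name + "=" + value + "\r\n"
    return result.encode(encoding)


payload = encode_text_plain([("t", "cats", "text"), ("q", "fur", "search")])
print(payload)  # b't=cats\r\nq=fur\r\n'
```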

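Payloads using the text/plain format are intended to be human readable. They are not reliably interpretable by computer, as the format is ambiguous (for example, there is no way to distinguish a literal newline in a value from the newline at the end of the value).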


* Original: http://www.w3.org/TR/2014/REC-html5-20141028/forms.html#form-submission-0