2021-04-18

【PostgreSQL】Fixed-length formatting for the fractional part only

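What I want to do is exactly what the title says. An integration with an external application required values in this shape: the fractional part always has a fixed number of digits (three in the queries below), while the integer part keeps its natural length.

...Quite a niche case.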

The basic form below should be enough.

select to_char(n, ('FMMI9990.000'))

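However, when the integer part has more digits than the format allows, the result comes out as something like ####.###, which is disappointing. Having more 9s in the format string than needed does not seem to cause any problem, though.

So I think the format has to be built dynamically. Is there a better way? As far as I can tell from the to_char() description in the official documentation, this seems to be the best option.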

select to_char(n, ('FMMI'|| lpad('', length(n::text), '9') || '0.' || lpad('', 3, '0')))
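
To make it easier to see what the expression above assembles, here is a small TypeScript sketch that builds the same format string outside the database. The names buildToCharFormat and repeatChar are made up for this illustration; only the resulting pattern mirrors the SQL.

```typescript
// Hypothetical helper: repeats a single character `count` times.
function repeatChar(ch: string, count: number): string {
  return new Array(count + 1).join(ch);
}

// Hypothetical helper mirroring
//   'FMMI' || lpad('', length(n::text), '9') || '0.' || lpad('', 3, '0')
function buildToCharFormat(valueText: string, fractionDigits: number): string {
  // One '9' per character of the textual value: always at least as many
  // integer positions as the value has digits.
  const integerPart = repeatChar("9", valueText.length);
  const fractionPart = repeatChar("0", fractionDigits);
  return "FMMI" + integerPart + "0." + fractionPart;
}

console.log(buildToCharFormat("1.2", 3)); // FMMI9990.000
```

For a three-character value such as 1.2 this produces the same pattern as the static version, and longer values simply get more 9s.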

Test data

with data(n) as (
values(null)
,(1)
,(1.2)
,(1.23)
,(1.234)
,(1.2345)
,(21.2345)
,(321.2345)
,(4321.2345)
,(54321.2345)
,(0)
,(0.2)
,(0.23)
,(0.234)
,(0.2345)
,(-1)
,(-1.2)
,(-1.23)
,(-1.234)
,(-1.2345)
,(-21.2345)
,(-321.2345)
,(-4321.2345)
,(-54321.2345)
,(-0.2)
,(-0.23)
,(-0.234)
,(-0.2345)
,(-5123456789087654321234567897654321.23456789)
,(-5123456789087654321234567897654321)
,(5123456789087654321234567897654321)
)

select n
      ,to_char(n, ('FMMI'|| lpad('', length(n::text), '9') || '0.' || lpad('', 3, '0')))
  from data

Addendum (2023-11-24)

Rereading this post after a long time, I realized this is good enough..

select round(n, 3)::text

2020-08-02

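【PostgreSQL】Does lateral help in queries where the same calculation appears several times?

Programming

What this post covers

As the title says. The assumed case is one where computing A needs columns X and Y, and B and C likewise need columns X and Y, or the result of A.

I come across articles that discuss performance improvements for aggregations such as group by, but I could not find many that discuss the usage in the title. Maybe I just failed to find them..

To state the conclusion first: performance improved with this usage as well. (I won't go into it here, but conversely, using lateral when the same calculation does not appear more than once makes the query slower, probably by the cost of the Nested Loop.) Personally I find it also improves readability, so as long as I don't misjudge when to use it, I'd like to use it actively.

Some will say, "Why not do the calculation in a nested subquery?" That is probably the right answer. But personally I don't like adding nesting levels... Well, yes, lateral is a subquery too, and it is nested. But at least the main from isn't a subquery, right...?

Environment

PostgreSQL 11.8, compiled by Visual C++ build 1914, 64-bit

Test tables

I prepared three tables for the check. I'm not aiming for meaningful calculations here; using "item", "tax" and "data", the queries compute "quantity * price", "quantity * price * tax rate" and "quantity * price + quantity * price * tax rate".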

create table item (
  code text,
  price numeric,
  tax_code text,

  PRIMARY KEY (code)
);

create table tax (
  code text,
  rate numeric,

  PRIMARY KEY (code)
);

create table data (
  record_id bigserial,
  item_code text,
  quantity numeric,

  PRIMARY KEY (record_id)
);

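Creating the test data

I want to test with about 10,000 rows (my work never goes beyond 100,000...). If every row were identical I imagine the calculation might get cached, so I tried to vary the values at least a little. I don't know whether that matters, or whether they actually vary.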

with a as(
  select generate_series(1, 26) c
)
insert into item
select
  chr(64 + c) as code,
  c * 100 as price,
  (c % 3)::text as tax_code
from a
;

insert into tax
values('0', 0)
     ,('1', 8)
     ,('2', 10)
;

with a as(
  select generate_series(0, 10000) c
)
insert into data(item_code, quantity)
select
  chr(65 + (c % 26)) as item_code,
  c % 10 + 1 as quantity
from a
;

The straightforward version

All calculations are written out in the SELECT list.

select
  data.record_id
 ,data.quantity * item.price as r1
 ,data.quantity * item.price * tax.rate / 100::numeric as r2
 ,data.quantity * item.price + data.quantity * item.price * tax.rate / 100::numeric as r3
from data
left join
  item
on data.item_code = item.code
left join
  tax
on item.tax_code = tax.code

Using lateral

The values that are used more than once (money and tax_money) are computed up front in lateral.

select
  data.record_id
 ,work_lateral.money as r1
 ,work_lateral.tax_money as r2
 ,work_lateral.money + work_lateral.tax_money as r3
from data
left join
  item
on data.item_code = item.code
left join
  tax
on item.tax_code = tax.code
left join lateral(
  select
    data.quantity * item.price as money
   ,data.quantity * item.price * tax.rate / 100::numeric as tax_money
) work_lateral
on true
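
As a rough analogy for readers who think in application code, the lateral subquery acts like a per-row helper whose named results the select list then reuses. The TypeScript sketch below only illustrates that structure; the JoinedRow shape and the function names are hypothetical, and plain numbers stand in for PostgreSQL's numeric type.

```typescript
// Hypothetical shape of one row after joining data, item and tax.
interface JoinedRow {
  recordId: number;
  quantity: number;
  price: number;
  taxRate: number; // percent, e.g. 8 or 10
}

// Plays the role of the lateral subquery: shared values are computed once per row.
function workLateral(row: JoinedRow): { money: number; taxMoney: number } {
  const money = row.quantity * row.price;
  const taxMoney = (money * row.taxRate) / 100;
  return { money, taxMoney };
}

// Plays the role of the outer select list: it only refers to the named results.
function selectRow(row: JoinedRow): { recordId: number; r1: number; r2: number; r3: number } {
  const w = workLateral(row);
  return { recordId: row.recordId, r1: w.money, r2: w.taxMoney, r3: w.money + w.taxMoney };
}

const exampleRow = selectRow({ recordId: 1, quantity: 2, price: 100, taxRate: 10 });
console.log(exampleRow); // r1 = 200, r2 = 20, r3 = 220
```

The point is the same as in the SQL: each formula is written once and given a name, so a reader does not have to compare repeated expressions by eye.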

Results of explain analyze

Straightforward version

----- Execution plan -----
Hash Left Join  (cost=54.42..487.25 rows=10001 width=104) (actual time=0.036..12.618 rows=10001 loops=1)
  Hash Cond: (item.tax_code = tax.code)
  ->  Hash Left Join  (cost=24.63..206.05 rows=10001 width=77) (actual time=0.027..2.306 rows=10001 loops=1)
        Hash Cond: (data.item_code = item.code)
        ->  Seq Scan on data  (cost=0.00..155.01 rows=10001 width=15) (actual time=0.011..0.475 rows=10001 loops=1)
        ->  Hash  (cost=16.50..16.50 rows=650 width=96) (actual time=0.013..0.013 rows=26 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 10kB
              ->  Seq Scan on item  (cost=0.00..16.50 rows=650 width=96) (actual time=0.005..0.008 rows=26 loops=1)
  ->  Hash  (cost=18.80..18.80 rows=880 width=64) (actual time=0.005..0.005 rows=3 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on tax  (cost=0.00..18.80 rows=880 width=64) (actual time=0.004..0.004 rows=3 loops=1)
Planning Time: 0.127 ms
Execution Time: 12.794 ms
--------------------------------------------------------------------------------

lateral version

----- Execution plan -----
Nested Loop Left Join  (cost=54.42..687.27 rows=10001 width=104) (actual time=0.189..10.882 rows=10001 loops=1)
  ->  Hash Left Join  (cost=54.42..262.23 rows=10001 width=77) (actual time=0.148..3.887 rows=10001 loops=1)
        Hash Cond: (item.tax_code = tax.code)
        ->  Hash Left Join  (cost=24.63..206.05 rows=10001 width=77) (actual time=0.033..2.430 rows=10001 loops=1)
              Hash Cond: (data.item_code = item.code)
              ->  Seq Scan on data  (cost=0.00..155.01 rows=10001 width=15) (actual time=0.013..0.568 rows=10001 loops=1)
              ->  Hash  (cost=16.50..16.50 rows=650 width=96) (actual time=0.013..0.013 rows=26 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 10kB
                    ->  Seq Scan on item  (cost=0.00..16.50 rows=650 width=96) (actual time=0.006..0.008 rows=26 loops=1)
        ->  Hash  (cost=18.80..18.80 rows=880 width=64) (actual time=0.106..0.106 rows=3 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  Seq Scan on tax  (cost=0.00..18.80 rows=880 width=64) (actual time=0.005..0.007 rows=3 loops=1)
  ->  Result  (cost=0.00..0.02 rows=1 width=64) (actual time=0.000..0.000 rows=1 loops=10001)
Planning Time: 0.138 ms
Execution Time: 11.079 ms
--------------------------------------------------------------------------------

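Results

I ran it several times; roughly, the former took 12 ms and the latter 11 ms, so there was a slight advantage. Incidentally, after issuing additional insert statements so that the data table held 1,110,003 rows, the former took 1400 ms and the latter 1200 ms.

Thoughts

I'm relieved to find it is quite usable. Of course, because a Nested Loop is executed, the straightforward version is faster in cases other than this one, where the same calculation appears several times (though the difference is again about the same size). Processing speed matters, but when calculations are involved it also takes time to judge whether two expressions are really the same, so being able to factor them out like this is worth noting. When the same calculation is repeated over and over inside case expressions in the select list, it really drains the reader's MP... (´・ω・｀)

In a sense it can be used like a variable, so how about calling it a readability improvement?

Bonus

Out of curiosity, here is the result of doing the calculation in a subquery as well. I expected this to be the fastest, but it wasn't. In my runs it came out at 1400 ms.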

select 
  _money.record_id
 ,_money.r1
 ,_money.r2
 ,_money.r1 + _money.r2 as r3
from (
  select
    data.record_id
   ,data.quantity * item.price as r1
   ,data.quantity * item.price * tax.rate / 100::numeric as r2
  from data
  left join
    item
  on data.item_code = item.code
  left join
    tax
  on item.tax_code = tax.code
) _money

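Looking at the execution plan, it is identical to the straightforward version. Are computed values evaluated lazily or something? (Presumably the planner simply flattened the subquery into the outer query, which would explain the identical plan.)

----- Execution plan -----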
Hash Left Join  (cost=54.42..47991.07 rows=1110003 width=104) (actual time=0.037..1377.255 rows=1110003 loops=1)
  Hash Cond: (item.tax_code = tax.code)
  ->  Hash Left Join  (cost=24.63..20058.41 rows=1110003 width=77) (actual time=0.028..247.948 rows=1110003 loops=1)
        Hash Cond: (data.item_code = item.code)
        ->  Seq Scan on data  (cost=0.00..17101.03 rows=1110003 width=15) (actual time=0.012..63.836 rows=1110003 loops=1)
        ->  Hash  (cost=16.50..16.50 rows=650 width=96) (actual time=0.013..0.013 rows=26 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 10kB
              ->  Seq Scan on item  (cost=0.00..16.50 rows=650 width=96) (actual time=0.005..0.007 rows=26 loops=1)
  ->  Hash  (cost=18.80..18.80 rows=880 width=64) (actual time=0.005..0.005 rows=3 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on tax  (cost=0.00..18.80 rows=880 width=64) (actual time=0.004..0.004 rows=3 loops=1)
Planning Time: 0.137 ms
Execution Time: 1395.279 ms
--------------------------------------------------------------------------------

References

LATERALを使ってみよう | Let's POSTGRES

Studio ODIN - blog風小ネタ集 > SQL の LATERAL キーワード

2020-07-05

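【PostgreSQL】Issuing sequence values per group

Background

As the title says. Suppose table A has the columns record_id and line_no; I ran into a situation where I wanted to assign a new sequence value per record_id.
Note: record_id and line_no are in a 1:N relationship

It would have been nice to write it like this, but I got an error because nextval is not a window function.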

nextval('MY_SEQ') over(partition by record_id)

Solution

The following works as intended.

case when lag(record_id) over(partition by record_id) is null then nextval('MY_SEQ')
     else currval('MY_SEQ')
 end

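Roughly speaking, it relies on the following *1.

lag() returns the value from the previous row in the same partition, and null when there is no previous row (that is, on the first row of each record_id)
currval() returns the value most recently obtained by nextval() for that sequence in the current session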
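
The interplay of the two functions can be sketched outside the database. The TypeScript below is only a simulation under stated assumptions: FakeSequence is a hypothetical in-memory stand-in rather than a PostgreSQL API, and the rows are assumed to arrive grouped by record_id, as they do in the example data.

```typescript
// Hypothetical in-memory stand-in for a sequence.
class FakeSequence {
  private current = 0;

  nextval(): number {
    this.current += 1;
    return this.current;
  }

  currval(): number {
    return this.current;
  }
}

interface DetailLine {
  recordId: number;
  lineNo: number;
}

// Assumes the rows are already grouped by recordId.
function assignGroupSeq(rows: DetailLine[], seq: FakeSequence): number[] {
  const assigned: number[] = [];
  let previous: number | null = null; // plays the role of lag(record_id)
  for (const row of rows) {
    // The first row of a group has no previous row with the same recordId.
    const isFirstOfGroup = previous === null || previous !== row.recordId;
    assigned.push(isFirstOfGroup ? seq.nextval() : seq.currval());
    previous = row.recordId;
  }
  return assigned;
}

const sampleLines: DetailLine[] = [
  { recordId: 10, lineNo: 1 },
  { recordId: 10, lineNo: 2 },
  { recordId: 10, lineNo: 3 },
  { recordId: 11, lineNo: 1 },
  { recordId: 11, lineNo: 2 },
  { recordId: 12, lineNo: 1 }
];
console.log(assignGroupSeq(sampleLines, new FakeSequence()).join(",")); // 1,1,1,2,2,3
```

The first row of each group advances the sequence and every following row of that group reuses the current value, which is the behavior the case expression depends on.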

Verification SQL

with data(record_id, line_no) as (
  values(10, 1)
       ,(10, 2)
       ,(10, 3)
       ,(11, 1)
       ,(11, 2)
       ,(12, 1)
)

select *
      ,case when lag(record_id) over(partition by record_id) is null then nextval('MY_SEQ')
            else currval('MY_SEQ')
        end as seq
  from data

Result

record_id	line_no	seq
10	1	1
10	2	1
10	3	1
11	1	2
11	2	2
12	1	3

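Thoughts

I tested something like "run the SQL above in session A with a slow statement added, and call nextval in session B"; the sequence values obtained in sessions A and B were properly independent of each other, so there seems to be no risk of duplicates. Even when it's written in the official docs, you want to confirm the behavior yourself, right?

...Having written all this, I realized two days later that the 1:N relationship in my case only arises because of the join, and that I could simply have issued the sequence values in a subquery on the table on the "1" side 🤔

References:

www.postgresql.jp

*1: See the official documentation link in the references for details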

2020-04-22

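【JavaScript】Tweeting from a shared account with GAS + Twitter API

Background

An acquaintance runs a small community. Well, something like a club. I help out once in a while. I'm a member, but I've hardly been able to take part 🤪

That acquaintance remarked something like, "It would make running things easier if each member could post updates from the community's Twitter account," and that gave me the idea in the title.

Overview

The Twitter account's ID and password are not shared with the members
So some application inevitably has to sit in between
The community's site has a members' page built with Wix.com *1
The members' site requires login, so keeping some information on the client side is probably OK
GAS (Google Apps Script) calls the Twitter API, and the community page calls the GAS API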

Details

Javascript「のみ」でTwitterAPIを叩いてみる - 動かざることバグの如し
今から10分ではじめる Google Apps Script(GAS) で Web API公開 - Qiita

I relied almost entirely on these blog posts 🎉

GAS side

If I had to add anything, it's that some details differ from the blog I referred to. That blog's environment is probably the browser (it uses jQuery), and the link to its library is dead.

The fetch call is rewritten to the GAS-specific one
The library files are copy-pasted into GAS (oauth.gs and sha1.gs)

(Screenshot: GAS_files)

Incidentally, the contents of oauth.gs are here: oauth.js · GitHub
and the contents of sha1.gs are here (the link written inside the file above): http://pajhome.org.uk/crypt/md5/sha1.js

/**
 * @param request: {{
 *            parameter: {
 *                text: string, // text to tweet
 *                cbf: string // Callback function name
 *            }
 *        }}
 */
function doGet(request) {
  console.log(request);
  const requestBody = request.parameter;
  const text = requestBody.text;
  const callbackFunctionName = requestBody.cbf;

   const options = {
      method: "POST",
      apiURL: "https://api.twitter.com/1.1/statuses/update.json",
      // ★ Change these. Even if the names differ slightly, the page where Twitter API keys are issued should make it clear which is which
      consumerKey: "",
      consumerSecret: "",
      accessToken: "",
      tokenSecret: "",
      body: text
  };

  const tweetResult = postTweets(options);

  const out = ContentService.createTextOutput();
  // Reportedly questionable security-wise, but the response is handled as JSONP (and for that reason GET is accepted; semantically wrong, but it can't be helped)
  out.setMimeType(ContentService.MimeType.JAVASCRIPT);
  out.setContent(callbackFunctionName + "(" + JSON.stringify(tweetResult) + ")");
  return out;
}

function postTweets(options) {
  const accessor = {
    consumerSecret: options.consumerSecret,
    tokenSecret: options.tokenSecret
  };
  const message = {
    method: options.method,
    action: options.apiURL,
    parameters: {
      oauth_consumer_key: options.consumerKey,
      oauth_version: "1.0",
      oauth_signature_method: "HMAC-SHA1",
      oauth_token: options.accessToken,
      status: options.body
    }
  };
  OAuth.setTimestampAndNonce(message);
  OAuth.SignatureMethod.sign(message, accessor);
  const url = OAuth.addToURL(message.action, message.parameters);
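  const r = UrlFetchApp.fetch(url, {
    method: "POST",
    // Return non-2xx responses instead of throwing, so the status check below can run
    muteHttpExceptions: true
  });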
  let success;
  let content;
  try {
    content = JSON.parse(r.getContentText());
    success = r.getResponseCode() === 200;
  } catch(ex) {
    console.error(ex);
    content = {};
    success = false;
  }

  const twitterResult = success ? {
    success: true,
    msg: "ツイートしました",
    content: {
      postUrl: "https://twitter.com/" + content.user.screen_name + "/status/" + content.id_str
    }
  } : {
    success: false,
    msg: "ツイート中にエラーが発生しました。"
  };
  console.log(twitterResult);
  return twitterResult;
}
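
Since doGet above concatenates the cbf parameter straight into the JavaScript response, one possible hardening step is to accept only plain identifiers as callback names. The sketch below is an assumption of mine rather than part of the original setup; toJsonp is a hypothetical helper written in TypeScript and would need to be adapted before being pasted into GAS.

```typescript
// Hypothetical helper: wraps a payload in a JSONP call after validating the callback name.
function toJsonp(callbackName: string, payload: unknown): string {
  // Accept only a simple identifier so the name cannot carry arbitrary script.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name: " + callbackName);
  }
  return callbackName + "(" + JSON.stringify(payload) + ")";
}

console.log(toJsonp("cbf", { success: true })); // cbf({"success":true})
```

With this in place, a request whose cbf value is anything other than a function name is rejected instead of being echoed back as code.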

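HTML side

It contains a little unnecessary information, but it is a simple page: type text into the textarea, press the button, and the API is called. JSONP may well be possible with window.fetch or XMLHttpRequest, but looking into that was a hassle, so I use jQuery. jQuery isn't used for anything else, so the browser side can probably be done with standard APIs alone.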

<!DOCTYPE html>
<html lang="ja">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Use twitter api</title>
</head>
<body>
<main>
    <div>
        <textarea id="status"
                  placeholder="ツイートする情報を入力してください"
                  cols="60"
                  rows="4"
        ></textarea>
    </div>
    <div class="btn-container">
        <button class="tweet-btn" onclick="tweets();">ツイートする</button>
    </div>
    <div class="tweet-result">
    </div>

</main>
<div class="as-console"></div>
</body>
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script>
    var endpoint = "https://script.google.com/macros/s/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";

    function tweets() {
        var text = document.getElementById("status").value;
        if (text == null || text.length === 0) {
            alert("無言ツイートはできません");
            return false;
        }
        if (text.length > 140) {
            alert("長い: " + text.length);
            return false;
        }
        $.ajax({
            type: 'GET',
            url: endpoint,
            dataType: 'jsonp',
            data: {
                text: text,
                cbf: "cbf"
            }
        });
    }

    /**
     * @param res {{
     *     success: true,
     *     msg: string,
     *     content: {
     *         postUrl: string,
     *     }
     * } | {
     *     success: false,
     *     msg: string,
     * }}
     */
    function cbf(res) {
        if (!res.success) {
            alert(res.msg);
            return;
        }
        document.getElementById("status").disabled = true;
        var currentTweetElement = document.createElement("span");
        var link = document.createElement("a");
        link.href = res.content.postUrl;
        link.text = res.content.postUrl;
        link.target = "_blank";
        link.rel = "noopener noreferrer";
        currentTweetElement.appendChild(document.createTextNode("今つぶやいたツイート"));
        currentTweetElement.appendChild(link);
        document.getElementsByClassName("tweet-result")[0].appendChild(currentTweetElement);
    }
</script>
</html>

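Thoughts

I don't normally use JSONP, so I think that part took the most time. It led me to study CORS a little along the way, so all in all it was worthwhile. This time it was the Twitter API, but GAS looks usable for calling other APIs too. Where you don't have the degree of confidentiality a members' site gives you, though, you would need some error checking and risk measures such as notifications when something goes wrong or deleting the GAS deployment..

I'm quietly happy that const and arrow functions can now be used in GAS.

*1: Come to think of it, is Wix.com not a CMS? https://support.wix.com/ja/article/%E3%83%AA%E3%82%AF%E3%82%A8%E3%82%B9%E3%83%88%EF%BC%9A%E3%82%B3%E3%83%B3%E3%83%86%E3%83%B3%E3%83%84%E3%83%9E%E3%83%8D%E3%83%BC%E3%82%B8%E3%83%A1%E3%83%B3%E3%83%88%E3%82%B7%E3%82%B9%E3%83%86%E3%83%A0%EF%BC%88cms%EF%BC%89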

2020-02-03

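【TypeScript】How to write string-to-number conversions that relied on JS implicit type coercion

Programming

Background

As usual, this is something pulled out of the work of rewriting JS into TS. I searched for things like "javascript - 0", "javascript minus zero" and "javascript 明示的型変換" (explicit type conversion), but nothing gave a direct answer. Is this common knowledge for people in the JavaScript world? Please tell me.

Overview

There are several ways to convert a string to a number in JavaScript, and one relatively common idiom is the following.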

const a = "1" - 0;
console.log(a, typeof a); // 1, "number"

The question is how to write this in TypeScript.

const a = "1" - 0; // Compile error!!

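Strictly speaking, then, this has nothing to do with TypeScript. But I feel this is something you rarely think about when writing JavaScript. Relying on the implicit conversion probably causes problems only rarely.
At least I have never thought about the detailed behavior. Personally, I wouldn't want to do this to null, undefined, or strings that can't be converted to numbers in the first place. It seems to add needless things to check.

Answer

In most cases Number() behaves the same.
"Most cases", or rather, I think this is simply correct, but honestly I couldn't find a definitive answer... English is hard.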

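let obj = { // let rather than const, because obj++ below reassigns obj
    valueOf: function() {
        return 123;
    }
}
console.log(obj.valueOf()); // 123
console.log(obj - 0); // 123 // when an object is converted to a number, the return value of valueOf() is used; see the reference links at the end
console.log(Number(obj)); // 123
console.log(parseFloat(obj)); // NaN // parseFloat converts its argument to a string first ("[object Object]")
obj++; // incidentally, increment works too
console.log(obj); // 124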

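Notes

When "string - number" is evaluated, the string is implicitly converted to a number. The logic is described in the ECMAScript specification, around the ToNumber section (see References). If it linked to Number() or the like I could be sure, but what's written there is a BNF-like parsing procedure. I may read it properly when I have time. Honestly, if I had that kind of time, I'd rather write the code so this never needs thinking about. It's TypeScript, after all. Language quirks that can be avoided are better avoided so as not to waste brain capacity! ...Which may sound like an excuse.

Supplement

new Number() and Number() are different. The former creates a wrapper object, while the latter returns a primitive value. The MDN examples explain this well.
parseFloat() won't do. Comparing the results of parseFloat(null), null - 0 and Number(null) shows why.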
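
The difference is easy to see side by side. The inputs in this TypeScript sketch are chosen for illustration only; since TypeScript's parseFloat accepts only strings, the string conversion that plain JavaScript performs implicitly is written out with String().

```typescript
// Sample inputs chosen for illustration; each line prints Number() next to parseFloat().
const samples: string[] = ["1", "", "  42  ", "12px", "1e3"];

for (const s of samples) {
  console.log(JSON.stringify(s), Number(s), parseFloat(s));
}

// null is the case called out above.
const fromNumber = Number(null);                 // 0
const fromParseFloat = parseFloat(String(null)); // NaN, because "null" is not numeric
console.log(fromNumber, fromParseFloat);
```

Number() converts the whole string (so "" becomes 0 and "12px" becomes NaN), while parseFloat() parses a leading numeric prefix (so "" becomes NaN and "12px" becomes 12). This is why the two cannot be swapped freely.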

References

https://qiita.com/uhyo/items/44c2f79873de13186743
http://ecma-international.org/ecma-262/6.0/index.html#sec-tonumber
https://developer.mozilla.org/ja/docs/Web/JavaScript/Reference/Global_Objects/Number

Thoughts

Escape with // @ts-ignore
Trust Number()

...What to do? The former is the one that is guaranteed not to change behavior, though.

Addendum (2020-03-06)

Just as I was thinking that, it turns out the "+" idiom can be used (´・ω・｀) It apparently can't be used with bigint, but it seems sufficient.

let aaa = "123";
console.log(typeof aaa);  // string
console.log(typeof +aaa); // number

aaa = null;
console.log(typeof aaa); // object
console.log(typeof +aaa); // number
console.log(+aaa); // 0
console.log(Number(aaa)); // 0


aaa = 123n;
console.log(typeof aaa);
// console.log(typeof +aaa); // Uncaught TypeError: Cannot convert a BigInt value to a number
// console.log(+aaa); // Uncaught TypeError: Cannot convert a BigInt value to a number
console.log(Number(aaa));

Reference URL:

https://qiita.com/uhyo/items/cc92a553059274d85403#%E5%8D%98%E9%A0%85%E6%BC%94%E7%AE%97%E5%AD%90

2020-01-30

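【TypeScript】How to specify the type of a single member of an interface

Programming

Background

If you are fully replacing the code with TypeScript, none of this is needed. It is for rewriting JavaScript into TypeScript while changing as little as possible and still making it type-safe.

Original code (JavaScript)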

var ConstantsTest = function () {};
ConstantsTest.MODE = {
  A: 0,
  B: 1,
  C: 2
};

Code after the rewrite (TypeScript)

interface Mode { // type works too. type seems to be the recent mainstream?
    A: 0,
    B: 1,
    C: 2
}

export default class ConstantsTest {

    public static MODE: Mode = {
        A: 0,
        B: 1,
        C: 2
    };

    public static isModeB(mode: Mode[keyof Mode]): boolean {
        return mode === ConstantsTest.MODE.B;
    }

    public static test() {
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.A)", ConstantsTest.isModeB(ConstantsTest.MODE.A)); // false
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.B)", ConstantsTest.isModeB(ConstantsTest.MODE.B)); // true
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.C)", ConstantsTest.isModeB(ConstantsTest.MODE.C)); // false
        console.log("ConstantsTest.isModeB(0)", ConstantsTest.isModeB(0)); // false
        console.log("ConstantsTest.isModeB(1)", ConstantsTest.isModeB(1)); // true
        console.log("ConstantsTest.isModeB(2)", ConstantsTest.isModeB(2)); // false
        // console.log("ConstantsTest.isModeB(3)", ConstantsTest.isModeB(3)); // compile error
    }
}

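What I wanted to write in this post

I didn't know how to specify the type of a single member of an interface; I figured it out, so I'm writing this while the momentum lasts.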

mode: Mode[keyof Mode]

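Thoughts

Useful-looking things like Partial and Pick are predefined, so I searched hard among them for something that would do the job (´・ω・｀) Thinking about it, it's quite simple, but apparently I don't have a TypeScript brain yet.
Or rather, are cases that need this kind of type rare?

Still, I feel it would be better to declare a type as a union of literal types and rewrite things straightforwardly. I changed the title in the end, but originally it was "【TypeScript】How to specify, as a parameter type, something that was defined as a constant in JavaScript". A hint of grumbling that migration work is quite tedious...

Addendum (2020/02/11)

I learned about typeof T, a way to extract a type from a value, so I tried it. After a little experimenting, though, it seems the type information can't be extracted precisely? The parameter of isModeB was inferred to take number.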

export default class ConstantsTest {

    public static MODE = {
        A: 0,
        B: 1,
        C: 2
    };

    public static isModeB(mode: (typeof ConstantsTest.MODE)[keyof typeof ConstantsTest.MODE]): boolean {
        return mode === ConstantsTest.MODE.B;
    }

    public static test() {
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.A)", ConstantsTest.isModeB(ConstantsTest.MODE.A)); // false
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.B)", ConstantsTest.isModeB(ConstantsTest.MODE.B)); // true
        console.log("ConstantsTest.isModeB(ConstantsTest.MODE.C)", ConstantsTest.isModeB(ConstantsTest.MODE.C)); // false
        console.log("ConstantsTest.isModeB(0)", ConstantsTest.isModeB(0)); // false
        console.log("ConstantsTest.isModeB(1)", ConstantsTest.isModeB(1)); // true
        console.log("ConstantsTest.isModeB(2)", ConstantsTest.isModeB(2)); // false
        console.log("ConstantsTest.isModeB(3)", ConstantsTest.isModeB(3)); // no compile error...
        console.log("ConstantsTest.isModeB(\"\")", ConstantsTest.isModeB("")); // compile error
    }
}
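
The widening to number seen in the addendum happens because the property values of a plain object literal are inferred as number. One possible way around it, assuming TypeScript 3.4 or later, is a const assertion. The sketch below uses a standalone MODE constant instead of the static class property, which is a simplification on my part; note that as const also makes the object readonly.

```typescript
// `as const` keeps the literal types 0, 1 and 2 instead of widening them to number.
const MODE = {
  A: 0,
  B: 1,
  C: 2
} as const;

// Resolves to the union 0 | 1 | 2.
type ModeValue = (typeof MODE)[keyof typeof MODE];

function isModeB(mode: ModeValue): boolean {
  return mode === MODE.B;
}

console.log(isModeB(MODE.A)); // false
console.log(isModeB(1));      // true
// isModeB(3); // expected to be rejected by the compiler under this typing
```

This keeps the "derive the type from the value" approach of the addendum while restoring the compile-time check that the interface version had.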

2019-07-19

【Java】【Hugo】ExcelTable_to_HugoTable

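Programming

Well, it's what the title says. Whether the English is correct is another matter.

I sometimes use Hugo, a static site generator where content is written in Markdown, and writing tables in Markdown is a real pain.

Tools that convert Excel to Markdown exist as web apps and as plugins, but the web apps fail to parse cells that contain line breaks, and it's not worth installing a plugin (I don't use VS Code for anything but play anyway, and a quick look for IntelliJ didn't turn up anything that fit).

On top of that, Hugo's Markdown parser differs slightly from others (or does it just feel that way?). I used to be a fan of HackMD, and things that were fine there sometimes don't work in Hugo.

So, since it isn't very hard for my purposes, I wrote my own. It's Java, but it fits in one file && uses no libraries, so it can be run right away.

Requirements I wanted to meet

No installation or anything of that sort
Easy to run (I actually wanted an exe file)
Handles line breaks inside Excel cells
Works with Hugo

I tidied it up a bit for the blog, but it still feels like crappy code (´･ω･`)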

import javafx.util.Pair; // used only because it was available out of the box; any Tuple-like type would do

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Main {
    /**
     * Takes the source text file as the argument. Writes the result to __output.txt in the same directory
     */
    public static void main(String[] args) throws Exception {
        if (args == null || args.length == 0 || args[0] == null || args[0].isEmpty()) {
            System.out.println("Usage: java Main ${INPUT_FILE_ABSOLUTE_PATH}");
            return;
        }
        Path dir = Paths.get(args[0]);
        List<String> header;
        List<List<String>> dataList;
        {
            String pureTable = pureTable(dir, Charset.forName("Windows-31J")); // on Windows the pasted text ends up in this encoding by default..
            Pair<List<String>, List<List<String>>> pair = split(pureTable);
            header = pair.getKey();
            dataList = pair.getValue();
        }

        List<Integer> maxLengthPerColumn = maxLengthPerColumn(header, dataList);

        List<List<String>> table = toMarkdownTable(header, dataList, maxLengthPerColumn);

        String result = table.stream()
                .map(e -> String.join(" | ", e))
                .collect(Collectors.joining("\n"));

        Files.write(dir.getParent().resolve("__output.txt"), result.getBytes(StandardCharsets.UTF_8)); // UTF-8 should be fine for the output
    }

    /**
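     * Converts the text you get when pasting an Excel table into a text editor (the one where cells containing line breaks get wrapped in ") into plain TSV with those line breaks ignored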
     */
    private static String pureTable(Path inputFile, Charset charset) throws Exception {
        String input = Files.lines(inputFile, charset)
                .filter(Objects::nonNull)
                .collect(Collectors.joining("\n"));


        StringBuilder builder = new StringBuilder();
        StringBuilder wkBuilder = new StringBuilder();
        boolean sameCell = false;
        boolean prevIsEscape = false;
        for (char c : input.toCharArray()) {
            if (c == '\\') {
                prevIsEscape = true;
                continue;
            }
            if (c == '"' && !prevIsEscape) {
                if (sameCell) {
                    builder.append(wkBuilder);
                } else {
                    wkBuilder = new StringBuilder();
                }
                sameCell = !sameCell;
            } else {
                if (sameCell) {
                    if (c != '\n') { // dropping line breaks inside a cell should be fine
                        builder.append(c);
                    }
                } else {
                    builder.append(c);
                }
                prevIsEscape = false;
            }
        }
        return builder.toString();
    }

    /**
     * Splits the header part and the data part into individual cells.
     */
    private static Pair<List<String>, List<List<String>>> split(String pureTable) {
        String[] wk = pureTable.split("\n", 2);
        String headerString = wk[0];
        String dataString = wk[1];

        List<String> header = Arrays.asList(headerString.split("\t", -1));
        List<List<String>> dataList = Arrays.stream(dataString.split("\n"))
                .map(e -> Arrays.asList(e.split("\t", -1)))
                .collect(Collectors.toList());

        return new Pair<>(header, dataList);
    }

    /**
     * Returns, for each column, the maximum length across the header and the data
     */
    private static List<Integer> maxLengthPerColumn(List<String> header, List<List<String>> dataList) {
        List<Integer> list = new ArrayList<>(header.size());
        for (ListIterator<String> ite = header.listIterator(); ite.hasNext(); ) {
            int colIndex = ite.nextIndex();
            ite.next(); // advance only; the value is not used
            list.add(maxLength(header, dataList, colIndex));
        }
        return list;
    }

    /**
     * Pads with the given character up to the maximum width (left-aligned, padded on the right)
     */
    private static String fill(String value, int max, char fix) {
        int valueLength = strLength(value);

        int diff = max - valueLength;
        if (diff <= 0) {
            return value;
        }
        StringBuilder sb = new StringBuilder(value);
        for (int i = 0; i < diff; i++) {
            sb.append(fix);
        }
        return sb.toString();
    }

    private static List<List<String>> toMarkdownTable(List<String> header, List<List<String>> dataList, List<Integer> maxLengthPerColumn) {
        // List[row[col]]
        List<List<String>> table = IntStream.range(0, 2 + dataList.size())
                .boxed() // mapToObj(ignore -> new ArrayList<String>()) apparently did not work
                .map(ignore -> new ArrayList<String>())
                .collect(Collectors.toCollection(ArrayList::new));

        for (ListIterator<Integer> ite = maxLengthPerColumn.listIterator(); ite.hasNext(); ) {
            int colIndex = ite.nextIndex();
            int max = ite.next();
            String headerText = fill(get(header, colIndex), max, ' ');
            table.get(0).add(headerText);
            String separatorText = fill("", max, '-');
            table.get(1).add(separatorText);

            int index = 2;
            for (List<String> data : dataList) {
                String dataText = fill(get(data, colIndex), max, ' ');
                table.get(index).add(dataText);
                index++;
            }
        }
        return table;
    }


    private static String get(List<String> list, int index) {
        int size = list.size();
        if (0 <= index && index < size) {
            return list.get(index);
        }
        return "";
    }

    private static int maxLength(List<String> header, List<List<String>> dataList, int colIndex) {
        int max = strLength(get(header, colIndex));
        for (List<String> data : dataList) {
            max = Math.max(max, strLength(get(data, colIndex)));
        }
        return max;
    }

    private static int strLength(String str) {
        return str.getBytes().length; // byte length in the default charset; handling everything properly would mean code points or...?
    }
}
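
The core of the conversion, padding every cell to its column's widest entry and joining the cells with " | ", can be sketched compactly. The TypeScript below is a hypothetical illustration rather than a port of the class above: it starts from already-split rows and measures width in UTF-16 code units, whereas the Java code counts bytes.

```typescript
// Returns the cell at `col`, or "" when the row is shorter than the header.
function cellAt(row: string[], col: number): string {
  const value = row[col];
  return value === undefined ? "" : value;
}

// Pads `value` on the right with `fill` (a single character) up to `width`.
function padRight(value: string, width: number, fill: string): string {
  let padded = value;
  while (padded.length < width) {
    padded += fill;
  }
  return padded;
}

function toMarkdownTable(header: string[], data: string[][]): string {
  const allRows = [header].concat(data);
  // Widest cell per column across the header and the data.
  const widths = header.map((_, col) => {
    let max = 0;
    for (const row of allRows) {
      max = Math.max(max, cellAt(row, col).length);
    }
    return max;
  });
  const formatRow = (row: string[], fill: string): string =>
    widths.map((width, col) => padRight(cellAt(row, col), width, fill)).join(" | ");

  // Header line, separator line made of '-', then one line per data row.
  const lines = [formatRow(header, " "), formatRow([], "-")];
  for (const row of data) {
    lines.push(formatRow(row, " "));
  }
  return lines.join("\n");
}

console.log(toMarkdownTable(["id", "name"], [["1", "alice"], ["22", "b"]]));
```

As in the Java version, the separator row is produced by padding an empty string with '-', so it always matches the column widths.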