
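I am exploring Elixir for fast imports of mixed-type data (CSV, JSON) into Postgres. Being new to Elixir, I am following the example given in the YouTube video "Fast Import and Export with Elixir and Postgrex - Elixir Hex package showcase" (https://www.youtube.com/watch?v=YQyKRXCtq4s). The basic mix application works up to the point where Poolboy is introduced, i.e. Postgrex successfully loads records into the database using a single connection.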

When I try to follow the Poolboy configuration and test it by running

FastIoWithPostgrex.import("./data_with_ids.txt")

in iex or from the command line, I get the following error, and I cannot determine its cause (username and password removed):

** (UndefinedFunctionError) function DBConnection.Poolboy.child_spec/1 is undefined (module DBConnection.Poolboy is not available)
    DBConnection.Poolboy.child_spec({Postgrex.Protocol, [types: Postgrex.DefaultTypes, name: :pg, pool: DBConnection.Poolboy, pool_size: 4, hostname: "localhost", port: 9000, username: "XXXX", password: "XXXX", database: "ASDDataAnalytics-DEV"]})
    (db_connection) lib/db_connection.ex:383: DBConnection.start_link/2
    (fast_io_with_postgrex) lib/fast_io_with_postgrex.ex:8: FastIoWithPostgrex.import/1

I am running this on Windows 10, connecting to a PostgreSQL 10.x server through a local SSH tunnel. Here is the lib/fast_io_with_postgrex.ex file:

defmodule FastIoWithPostgrex do
  @moduledoc """
  Documentation for FastIoWithPostgrex.
  """

  def import(filepath) do

    {:ok, pid} = Postgrex.start_link(name: :pg,
      pool: DBConnection.Poolboy,
      pool_size: 4,
      hostname: "localhost",
      port: 9000,
      username: "XXXX", password: "XXXX", database: "ASDDataAnalytics-DEV")

    File.stream!(filepath)
    |> Stream.map(fn line ->
        [id_str, word] = line |> String.trim |> String.split("\t", trim: true, parts: 2)
        {id, ""} = Integer.parse(id_str)
        [id, word]
    end)
    |> Stream.chunk_every(10_000, 10_000, [])
    |> Task.async_stream(fn word_rows ->
      Enum.each(word_rows, fn word_sql_params ->
        Postgrex.transaction(:pg, fn conn ->
          IO.inspect Postgrex.query!(conn, "INSERT INTO asdda_dataload.words (id, word) VALUES ($1, $2)", word_sql_params)
#        IO.inspect Postgrex.query!(pid, "INSERT INTO asdda_dataload.words (id, word) VALUES ($1, $2)", word_sql_params)    
        end , pool: DBConnection.Poolboy, pool_timeout: :infinity, timeout: :infinity) 
      end)
    end, timeout: :infinity)
    |> Stream.run

  end # def import(file)
end

Here is the mix.exs file:

defmodule FastIoWithPostgrex.MixProject do
  use Mix.Project

  def project do
    [
      app: :fast_io_with_postgrex,
      version: "0.1.0",
      elixir: "~> 1.7",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  # Run "mix help compile.app" to learn about applications.
  def application do
    [
      extra_applications: [:logger, :poolboy, :connection]
    ]
  end

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      # {:dep_from_hexpm, "~> 0.3.0"},
      # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"},

      {:postgrex, "~>0.14.1"},
      {:poolboy, "~>1.5.1"}
    ]
  end
end

Here is the config/config.exs file:

# This file is responsible for configuring your application
# and its dependencies with the aid of the Mix.Config module.

use Mix.Config

config :fast_io_with_postgrex, :postgrex,
  database: "ASDDataAnalytics-DEV",
  username: "XXXX",
  password: "XXXX",
  name: :pg,
  pool: DBConnection.Poolboy,
  pool_size: 4

# This configuration is loaded before any dependency and is restricted
# to this project. If another project depends on this project, this
# file won't be loaded nor affect the parent project. For this reason,
# if you want to provide default values for your application for
# 3rd-party users, it should be done in your "mix.exs" file.

# You can configure your application as:
#
#     config :fast_io_with_postgrex, key: :value
#
# and access this configuration in your application as:
#
#     Application.get_env(:fast_io_with_postgrex, :key)
#
# You can also configure a 3rd-party app:
#
#     config :logger, level: :info
#

# It is also possible to import configuration files, relative to this
# directory. For example, you can emulate configuration per environment
# by uncommenting the line below and defining dev.exs, test.exs and such.
# Configuration from the imported file will override the ones defined
# here (which is why it is important to import them last).
#
#     import_config "#{Mix.env()}.exs"

Any help in finding the cause of this error would be greatly appreciated!


2 Answers


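Thanks. Following your suggestion, I got the original example to work by downgrading the dependency versions in the mix.exs file and adding a dependency on an earlier version of db_connection: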

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      # {:dep_from_hexpm, "~> 0.3.0"},
      # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"},

      {:postgrex, "0.13.5"},
      {:db_connection, "1.1.3"},
      {:poolboy, "~>1.5.1"}
    ]
  end

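I will also try your suggestion of changing the code to replace Poolboy with the new pool manager in the later version of db_connection, and see whether that works as well.

I'm sure a lot of thought went into the architecture change, but I have to say there is very little information on why Poolboy, once so popular, is not even supported as a pool type in the latest version of db_connection.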

Answered 2018-12-05T17:30:15.257

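I didn't want to dig too deep into why it isn't working, but that example is a bit old: the poolboy 1.5.1 that deps.get pulls in for you dates from 2015, and the example uses Elixir 1.4.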

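Also, if you look at the deps in Postgrex's mix.exs, you will notice that your freshly installed lib (0.14) depends on elixir_ecto/db_connection 2.x.

The code you are referring to uses Postgrex 0.13.x, which depends on {:db_connection, "~> 1.1"}. So I would expect incompatibilities.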
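One way to confirm this kind of mismatch from inside the project is to ask the VM which pool modules it can actually load. The snippet below is a minimal sketch, not part of Postgrex or db_connection: `PoolCheck` and `available/1` are hypothetical names introduced here for illustration, built only on the standard `Code.ensure_loaded?/1` and `Application.spec/2`.

```elixir
defmodule PoolCheck do
  # Hypothetical helper: returns the modules from `candidates`
  # that can actually be loaded in the current project.
  def available(candidates) do
    Enum.filter(candidates, &Code.ensure_loaded?/1)
  end
end

# Run inside `iex -S mix` so the project's dependencies are on the code path.
pools = PoolCheck.available([DBConnection.Poolboy, DBConnection.ConnectionPool])
IO.inspect(pools, label: "available pools")

# Version of db_connection that mix resolved (nil if the app is not loaded).
IO.inspect(Application.spec(:db_connection, :vsn), label: "db_connection version")
```

If DBConnection.Poolboy is missing from the printed list, that matches the "module DBConnection.Poolboy is not available" message in the question.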

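If I wanted to see it working, I would use the lib versions you see in the example code's mix.lock file, along with the Elixir version it was written for.

Maybe start by lowering the Postgrex version to one from around that time (possibly between 0.12.2 and the example's locked version).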

Also, the Elixir version may play some part here; check this

Greetings!

  • Have fun

EDIT:

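You can use DBConnection.ConnectionPool instead of poolboy, together with the latest postgrex and Elixir versions. I am not sure about the performance difference, but you can compare. Just do the following: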

In config/config.exs (check whether you need a password, etc.):

config :fast_io_with_postgrex, :postgrex,
  database: "fp",
  name: :pg,
  pool: DBConnection.ConnectionPool,
  pool_size: 4

and in lib/fast_io_with.....ex replace the Postgrex.start_link(... lines with:

{:ok, pid} = Application.get_env(:fast_io_with_postgrex, :postgrex)
          |> Postgrex.start_link

This gives me:

mix run -e 'FastIoWithPostgrex.import("./data_with_ids.txt")'
1.76s user 0.69s system 106% cpu 2.294 total

on Postgrex 0.14.1 and Elixir 1.7.3
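Putting the two changes together, the import module might look roughly like the sketch below. It assumes postgrex 0.14.x with db_connection 2.x, and the :postgrex entry in config/config.exs extended with whatever hostname, port, username and password your server needs. Moving the parsing and insert steps into `parse_line/1` and `insert_rows/1`, and dropping the `pool:` and `pool_timeout:` options from the transaction call, are choices made for this sketch rather than something taken from the video.

```elixir
defmodule FastIoWithPostgrex do
  # Sketch only: assumes postgrex 0.14.x (db_connection 2.x) and the
  # :postgrex entry in config/config.exs with pool: DBConnection.ConnectionPool.

  def import(filepath) do
    # Start the pool from application config; it registers under the name :pg.
    {:ok, _pid} =
      :fast_io_with_postgrex
      |> Application.get_env(:postgrex)
      |> Postgrex.start_link()

    filepath
    |> File.stream!()
    |> Stream.map(&parse_line/1)
    |> Stream.chunk_every(10_000)
    |> Task.async_stream(&insert_rows/1, timeout: :infinity)
    |> Stream.run()
  end

  # Turns a line such as "42\tword\n" into [42, "word"],
  # the parameter list for the INSERT statement.
  def parse_line(line) do
    [id_str, word] =
      line |> String.trim() |> String.split("\t", trim: true, parts: 2)

    {id, ""} = Integer.parse(id_str)
    [id, word]
  end

  # Inserts one chunk; each row runs in its own transaction, as in the question.
  defp insert_rows(rows) do
    Enum.each(rows, fn params ->
      Postgrex.transaction(
        :pg,
        fn conn ->
          Postgrex.query!(
            conn,
            "INSERT INTO asdda_dataload.words (id, word) VALUES ($1, $2)",
            params
          )
        end,
        timeout: :infinity
      )
    end)
  end
end
```

It is run the same way as before, for example with mix run -e 'FastIoWithPostgrex.import("./data_with_ids.txt")'.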

Answered 2018-12-04T23:39:25.733