Questions tagged [foundry-code-repositories]

104 questions

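0 votes

1 answer

188 views

palantir-foundry - How do I get better performance in my Palantir Foundry transformation when my data scale is small?

My input datasets are all under 1GB in size, and the total output size of my transformation is also under 1GB. I've noticed that my Code Workbook builds are very slow for the data scale I expect, and I'm wondering which "knobs" I can turn to optimize them.

For example, I see in the Spark details of my build that several of my stages have 200 tasks, each fetching only a few KB of data. Is that right?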
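For context, 200 is the default value of `spark.sql.shuffle.partitions` in open-source Spark, which would explain many tiny tasks on a small dataset. The following is a minimal sketch of how one might estimate a more fitting partition count. The helper name `suggested_partitions` and the 128 MB target size are assumptions for illustration, not part of Spark or Foundry.

```python
import math

# Hypothetical helper, not part of Spark or Foundry: estimate how many
# shuffle partitions a dataset needs for a chosen target partition size.
DEFAULT_TARGET_BYTES = 128 * 1024 * 1024  # assumed target size, tune as needed


def suggested_partitions(total_bytes: int, target_bytes: int = DEFAULT_TARGET_BYTES) -> int:
    """Round up so no partition exceeds the target, and never return fewer than 1."""
    if total_bytes < 0 or target_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_bytes must be > 0")
    return max(1, math.ceil(total_bytes / target_bytes))


if __name__ == "__main__":
    one_gb = 1024 ** 3
    print(suggested_partitions(one_gb))       # partitions for 1 GB of data
    print(suggested_partitions(50 * 1024))    # partitions for a few KB of data
```

In plain PySpark the result could be passed to `df.repartition(n)` or used as the value of `spark.sql.shuffle.partitions`; how that setting is exposed inside a Foundry build is not covered by this sketch.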

palantir-foundry foundry-code-repositories foundry-code-workbooks

2022-01-20T19:37:03.250

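0 votes

1 answer

123 views

palantir-foundry - How do I know whether my Foundry job is using AQE?

I sometimes hear people mention this AQE feature, and I'd like to know how to verify whether my jobs are using it. I'm running transformations in both Code Repositories and Code Workbooks.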
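In open-source Spark 3.x, AQE (Adaptive Query Execution) is controlled by `spark.sql.adaptive.enabled`, and a query planned with it shows an `AdaptiveSparkPlan` node at the root of its physical plan. A minimal sketch of checking a plan's text for that node follows. The function name `plan_uses_aqe` is hypothetical, and the sample plan strings are illustrative, not output from a real Foundry build.

```python
def plan_uses_aqe(plan_text: str) -> bool:
    """Return True if the physical plan text contains an AdaptiveSparkPlan node."""
    return "AdaptiveSparkPlan" in plan_text


# Illustrative plan text only; a real plan will have different operators and ids.
SAMPLE_PLAN_WITH_AQE = """== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[id], functions=[count(1)])
   +- Exchange hashpartitioning(id, 200)
"""

SAMPLE_PLAN_WITHOUT_AQE = """== Physical Plan ==
HashAggregate(keys=[id], functions=[count(1)])
+- Exchange hashpartitioning(id, 200)
"""

if __name__ == "__main__":
    print(plan_uses_aqe(SAMPLE_PLAN_WITH_AQE))
    print(plan_uses_aqe(SAMPLE_PLAN_WITHOUT_AQE))
```

In plain PySpark the plan text is what `df.explain()` prints, and `spark.conf.get("spark.sql.adaptive.enabled")` reports the setting directly. Where Foundry surfaces the physical plan for a build is an assumption left to the reader to confirm.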

palantir-foundry foundry-code-repositories foundry-code-workbooks

2022-01-20T20:19:57.030

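0 votes

1 answer

71 views

palantir-foundry - How do I access a file uploaded to a folder from a transformation?

I uploaded an image file to a folder in Foundry, and I'd like to use it as an input to a transformation. It appears to be stored as some kind of resource in a service called Blobster. How can I access the file and use it?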

palantir-foundry foundry-code-repositories foundry-code-workbooks

2022-01-21T15:47:23.803

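0 votes

1 answer

117 views

apache-spark - Shuffle stage failing due to executor loss

My Spark job fails with the following error: **"org.apache.spark.shuffle.FetchFailedException: The relative remote executor(Id: 21), which maintains the block data to fetch is dead."**

Overview of my Spark job

Input size is about 35 GB

I broadcast-joined all the smaller tables with the main table into dataframe1, then salted each large table before joining it with dataframe1 (the left table).

Profiles used:

With the approach and profiles above, I was able to cut the run time by 50%, but I still hit shuffle stage failures due to executor loss.

Is there any way to fix this?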
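The poster's code is not shown, so the following is only a generic illustration of what "salting" a join means, written in plain Python rather than PySpark. All names (`NUM_SALTS`, `salt_large_side`, `explode_small_side`, `salted_join`) are hypothetical, and the sketch performs an inner join for brevity even though the question describes a left join.

```python
import random

NUM_SALTS = 4  # assumed number of salt buckets


def salt_large_side(rows, num_salts=NUM_SALTS, seed=0):
    """Attach a random salt to each (key, value) row of the skewed side."""
    rng = random.Random(seed)
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def explode_small_side(rows, num_salts=NUM_SALTS):
    """Replicate each (key, value) row of the other side once per salt value."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def salted_join(large_rows, small_rows, num_salts=NUM_SALTS):
    """Inner-join on (key, salt), then drop the salt from the output rows."""
    lookup = {}
    for salted_key, value in explode_small_side(small_rows, num_salts):
        lookup.setdefault(salted_key, []).append(value)
    return [
        (key, left, right)
        for (key, salt), left in salt_large_side(large_rows, num_salts)
        for right in lookup.get((key, salt), [])
    ]


if __name__ == "__main__":
    large = [("a", i) for i in range(10)] + [("b", 99)]
    small = [("a", "x"), ("c", "z")]
    print(sorted(salted_join(large, small)))
```

The point of the salt is that rows sharing one hot key are spread over `NUM_SALTS` distinct join keys, so no single task has to hold all of them. The cost is that the other side is replicated once per salt value.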

apache-spark palantir-foundry foundry-code-repositories foundry-python-transform

2022-01-26T12:38:18.200

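0 votes

1 answer

66 views

palantir-foundry - PALANTIR-FOUNDRY: Can I propagate column descriptions?

I have a transformation pipeline. Is it possible to propagate the description of a given column to downstream datasets in the pipeline?

That way, people could add a description upstream and have it propagate downstream automatically.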

palantir-foundry foundry-code-repositories

2022-01-27T15:10:40.607

0 votes

1 answer

99 views

apache-spark - How do I make my Spark job run faster using executors?

I know my code is free from antipatterns since I don't have any warnings in my Authoring code editor, so I know my code is doing PySpark operations that are distributed and scalable.

My current job has 2 executors assigned to it with 2 cores each, and it runs with task parallelism of 16 as seen on the Spark Details page.

How do I make this job run faster?

apache-spark palantir-foundry foundry-code-repositories foundry-code-workbooks

2022-01-31T17:44:38.173

0 votes

1 answer

43 views