Trino) Setting the socket timeout_hive.s3.socket-timeout

MightyTedKim 2023. 3. 15. 11:48

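While evaluating Trino for adoption, I found that it has no way to pick a specific partition and update just that one; you have to sync the whole table.

As a result, during the initial setup there was so much data that a socket timeout exception occurred.

I didn't copy the log at the time, so here is a similar one copied from elsewhere.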

com.facebook.presto.spi.PrestoException: hive-metastore-server:9083:
java.net.SocketTimeoutException: Read timed out

It happened because I ran the command below:

CALL system.sync_partition_metadata('test', 'hgkim', 'add', true);
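
For context, the arguments of this procedure are the schema name, the table name, a mode, and an optional case-sensitivity flag. The sketch below reuses the `test.hgkim` table from this post and shows the three modes as described in the Trino Hive connector documentation; it assumes the session's current catalog is the Hive catalog, as in the call above.

```sql
-- ADD: register partitions that exist on storage but not in the metastore
CALL system.sync_partition_metadata('test', 'hgkim', 'ADD', true);

-- DROP: remove partitions that are in the metastore but no longer on storage
CALL system.sync_partition_metadata('test', 'hgkim', 'DROP', true);

-- FULL: perform both ADD and DROP in one call
CALL system.sync_partition_metadata('test', 'hgkim', 'FULL', true);
```

ADD, the mode used in this post, only picks up new partitions, while FULL also removes stale ones. Either way the call covers the whole table, which is presumably why a table with many partitions runs long enough to hit a timeout.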

So I increased the socket timeout.

kind: ConfigMap
apiVersion: v1
metadata:
  name: trino-configs
data:
  hive.properties: |-
    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://hive-metastore.trino.svc.cluster.local:9083
    hive.allow-drop-table=true
    hive.max-partitions-per-scan=1000000
    hive.s3.endpoint=[s3 or object storage]
    hive.s3.aws-access-key=[s3 or object storage]
    hive.s3.aws-secret-key=[s3 or object storage]
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    hive.s3.max-connections=100
    hive.s3.socket-timeout=5m

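The default for `hive.s3.socket-timeout` is 5 seconds, which turned out to be far too short. The Trino documentation lists it as:

Property name: hive.s3.socket-timeout
Description: TCP socket read timeout.
Default: 5 seconds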

After changing the value and applying it, the timeout no longer occurred.

+ Commands differ slightly between engines

Each query engine has its own command for bringing partitions or metadata up to date.

To add partitions in Hive:

MSCK REPAIR TABLE test.hgkim

To refresh table metadata in Spark SQL:

REFRESH TABLE test.hgkim

To do the same in Trino:

CALL system.sync_partition_metadata('test', 'hgkim', 'add', true);
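
All three commands above operate on the whole table. As a possible alternative when only one partition needs registering, the Trino Hive connector documentation also describes a `system.register_partition` procedure, which is enabled only when `hive.allow-register-partition-procedure=true` is set in the catalog properties. The sketch below is only an illustration under assumptions: the partition column `dt` and its value are hypothetical, since this post does not show how `test.hgkim` is partitioned.

```sql
-- Hypothetical example: assumes test.hgkim is partitioned by one column named dt
-- and that hive.allow-register-partition-procedure=true is set for the catalog.
-- The optional location argument is omitted, so the partition path is derived
-- from the partition column names and values.
CALL system.register_partition('test', 'hgkim', ARRAY['dt'], ARRAY['2023-03-15']);
```

I have not measured whether this avoids the timeout described in this post; it is mentioned only because it targets a single partition instead of the full table.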

https://trino.io/docs/current/connector/hive.html

References

https://groups.google.com/g/presto-users/c/euDfmyfXh4Y

https://stackoverflow.com/questions/69375479/connect-timeout-from-presto-trino-to-amazon-s3
