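This page is an excerpt and archive of the corresponding thread on 未名空间 (MITBBS). Posts less than a week old show at most 50 characters; older posts show up to 500.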
JobHunting board - Was the Indian interviewer sabotaging me, or was this a genuinely deep question?
k****t
Posts: 184
1
Interviewer: if the production application hang, how do you find out what caused
the problem?
Me: I will dump heap to ... (interviewer interrupts: "suppose it's production, you can't
stop it.")
Me: I check CPU, if CPU is busy... (interviewer interrupts: "suppose CPU is not busy.")
Me: I will check memory usage, if ... (interviewer interrupts: "suppose not memory used up.")
Me: I will check if there is I/O blocking ... (interviewer interrupts: "suppose not because
of I/O blocking.")
Me: That's my way to analyze issue, I will rule out something to ...
(interviewer interrupts: "ok, let's suppose it's I/O issue, but it's million line code
application, and could be thousands of part involves I/O, how do you solve
the problem?")
Me: For this huge application, troubleshooting needs deep understanding of
the codes.
(interviewer interrupts: "suppose you know the code very well, and suppose you wrote the
code yourself.")
I fell silent, and so did the interviewer. (I was thinking: he must want one definite answer, a single sentence that hits the nail on the head, but I had no such answer... and I sank into a long think.)
One minute later:
Interviewer: it's ok, let's move on to next question.
I still can't come up with an answer. Could the experts here give me some pointers?
Thanks!
w**a
Posts: 487
2
Newbie question: does this application have logs? Can you read the logs while the program is still running?

[In reply to k****t, post 1]

k****t
Posts: 184
3
I didn't think of that, so I didn't ask at the time.

[In reply to w**a, post 2]

h*******9
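Posts: 46
4
Check logs... If possible, check the database too. If it is a service, you can also check the
monitors; most services should have logs and monitoring services. But honestly, if it is
an application, nothing you said was wrong, because an application can generally be
paused. The interviewer called it an "application," whether on purpose or not.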
g***s
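Posts: 3811
5
Logs are of course the first choice, but in most cases they probably won't show why it hangs.
A thread dump is the first thing to consider. I'd guess that is the answer he wanted: kill -3 $pid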
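
Editor's sketch: on a HotSpot JVM, `kill -3` sends SIGQUIT, which makes the JVM print a thread dump to its standard output without terminating. What to do with the dump is not spelled out in the thread, so the following is only a minimal illustration, assuming the usual HotSpot text layout. `SAMPLE_DUMP` is an abbreviated, hypothetical dump (thread names, addresses, and class names are made up), and `summarize` is a helper invented for this example, not part of any tool.

```python
import re
from collections import Counter

# Abbreviated, hypothetical dump in the HotSpot text layout.
# Real dumps carry more fields and much deeper stacks.
SAMPLE_DUMP = '''
"http-worker-1" #21 prio=5 os_prio=0 tid=0x00007f1 nid=0x1a03 waiting for monitor entry [0x00007f2]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.example.OrderDao.save(OrderDao.java:42)
    - waiting to lock <0x000000076ab62208> (a java.lang.Object)

"http-worker-2" #22 prio=5 os_prio=0 tid=0x00007f3 nid=0x1a04 runnable [0x00007f4]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    - locked <0x000000076ab62208> (a java.lang.Object)

"scheduler" #23 prio=5 os_prio=0 tid=0x00007f5 nid=0x1a05 waiting on condition [0x00007f6]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
'''

NAME_RE = re.compile(r'^"(?P<name>[^"]+)"')
STATE_RE = re.compile(r'java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')
WAIT_RE = re.compile(r'- waiting to lock <(?P<lock>0x[0-9a-f]+)>')
HOLD_RE = re.compile(r'- locked <(?P<lock>0x[0-9a-f]+)>')


def summarize(dump):
    """Return (thread-state counts, {blocked thread: thread holding its lock})."""
    states = Counter()
    waiting = {}   # thread name -> lock address it is waiting for
    holders = {}   # lock address -> thread name that holds it
    current = None
    for line in dump.splitlines():
        m = NAME_RE.match(line)
        if m:
            current = m.group("name")
            continue
        if current is None:
            continue
        m = STATE_RE.search(line)
        if m:
            states[m.group("state")] += 1
            continue
        m = WAIT_RE.search(line)
        if m:
            waiting[current] = m.group("lock")
            continue
        m = HOLD_RE.search(line)
        if m:
            holders[m.group("lock")] = current
    blocked = {t: holders.get(lock, "unknown") for t, lock in waiting.items()}
    return states, blocked


states, blocked = summarize(SAMPLE_DUMP)
print(dict(states))
print(blocked)
```

In this made-up dump, one worker is blocked on a monitor held by another worker that is sitting in a socket read, which is the kind of "hung but CPU idle" picture the interviewer's constraints describe.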
g*****g
Posts: 34805
6
I would set up metrics to cover the frequent API calls for both volume and
latency. I would have the metrics logged to a separate server and displayed
on a timeline chart, with alerts to warn me
if the volume/latency is over a certain threshold compared to history. I
would even set up a circuit breaker if self-recovery is possible. It
should be pretty easy to narrow down which call is causing trouble. There
are open source tools for all of these.
The key is to prepare for such accidents, not to react to them. If there is a number you
want to know when it hangs, you should build it beforehand.
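
Editor's sketch: the "threshold compared to history" alert could look roughly like this. The post does not say which tools or which rule to use, so the three-sigma rule, the function names, and the sample endpoints and numbers below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def is_anomalous(history, current, sigmas=3.0):
    """True if `current` exceeds the historical mean by more than `sigmas` std devs."""
    if len(history) < 2:
        return False  # not enough history to judge
    return current > mean(history) + sigmas * pstdev(history)


def slow_calls(history_by_call, latest_by_call, sigmas=3.0):
    """Names of API calls whose latest latency is anomalous versus their own history."""
    return sorted(
        name
        for name, latest in latest_by_call.items()
        if is_anomalous(history_by_call.get(name, []), latest, sigmas)
    )


# Hypothetical latency samples in milliseconds.
history = {
    "GET /orders": [100, 110, 90, 105, 95],
    "POST /pay": [200, 210, 190, 205, 195],
}
latest = {"GET /orders": 102, "POST /pay": 900}
print(slow_calls(history, latest))
```

Keeping a separate baseline per call is what makes it easy to narrow down which call is misbehaving, as the post suggests.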

[In reply to k****t, post 1]

f******o
Posts: 102
7
Since he said "suppose you wrote the code and know the code very well," the way in is
from the code side. Find out when the problem first appeared, then find the culprit change list, and revert that
change list or revert the deployment.
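
Editor's sketch: one common way to find a culprit change list is a binary search over the ordered changes. This is a generic illustration, not something the poster described; the change-list IDs and the `is_bad` check (which in practice would mean deploying that build somewhere and testing for the hang) are hypothetical.

```python
def first_bad(changes, is_bad):
    """Binary search for the first change at which the problem appears.

    Assumes `changes` is ordered oldest to newest, the oldest is known good,
    the newest is known bad, and the problem persists once introduced.
    """
    lo, hi = 0, len(changes) - 1  # invariant: changes[lo] is good, changes[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(changes[mid]):
            hi = mid
        else:
            lo = mid
    return changes[hi]


# Hypothetical change lists; pretend the hang was introduced in cl-106.
changes = [f"cl-{n}" for n in range(101, 109)]
probes = []


def is_bad(change):
    probes.append(change)  # record each build we had to test
    return int(change.split("-")[1]) >= 106


print(first_bad(changes, is_bad), "found after", len(probes), "probes")
```

With eight candidate changes only three builds need to be checked, which matters when each check means reproducing a production hang.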
c***d
Posts: 996
8
He already said it's app server I/O...
This kind of thing is really better suited to an SRE. Even if you know nothing about the code, given a production
server that is I/O busy, you should be able to find the offending function within 30 minutes.

[In reply to g*****g, post 6]

w***x
Posts: 105
9
This interviewer has obviously never written a program.
g*****g
Posts: 34805
10
Once it has really hung, there isn't much to do besides kill -3 to see what all the threads are doing. With the threads all used up, it's no
surprise if nothing gets logged.

[In reply to c***d, post 8]

p*****y
Posts: 529
11
This is not a purely technical question. By stretching you in a "rude" way, he
tried to find out how you perform under pressure and whether you can keep
calm and keep the conversation going even if the other party does not, which
is very typical in a real production support scenario. At
least, this is how I typically use this kind of question.

[In reply to k****t, post 1]

J****n
Posts: 937
12
This kind of question is pointless. There are many ways to solve the problem, and which one to choose depends on the actual situation. The interviewer has
a single answer in mind, or only knows one answer, and if you don't pick his answer you are wrong. This is a really dumb way to
interview; in the vast majority of cases it shows that the interviewer doesn't understand his own question, or only knows one answer.

[In reply to k****t, post 1]

a**n
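Posts: 313
13
If the CPU isn't maxed out, and not even a single CPU is at 100%, I'd guess the interviewer wanted you to use jstack or kill -3,
jvisualvm, or some other profiling tool to attach to the process and look for things like a deadlock.
He was probably a customer support person. This is a question of experience, so he was still being somewhat unfair to you.
However, a jmap heap dump will not stop the process, so the interviewer doesn't fully understand it himself either.

[In reply to g*****g, post 10]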

b********n
Posts: 5997
14
you should say, 'suppose you shut yr f**k up, everything will be fine.'

[In reply to k****t, post 1]

b******l
Posts: 860
15
This is the right answer. Anyone who sulks or loses their temper out of embarrassment should go reflect on it. If you ever hit an outage in production
with the director/VP going frantic on the call, you'll know how routine it is to handle questions like this.

[In reply to p*****y, post 11]

w***x
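Posts: 105
16
For a programmer, the most conventional answer is to attach gdb, get a core dump, and study it at leisure...
I feel that anyone who asks this kind of dumb question has either never written a program or is deliberately picking a fight; if neither, he's just nuts.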
s*******e
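Posts: 1630
17
He said to assume it's your own code, so talk about how you write instrumentation to help live-site debugging.
If he says "suppose you have no logging," answer that your prod code always has logging; otherwise it isn't
acceptable prod code.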
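
Editor's sketch: one form such instrumentation could take is a registry of in-flight operations, so that a hung process can report which call has been running too long. The post gives no design, so the `InFlightTracker` class, its method names, and the operation names below are all hypothetical.

```python
import threading
import time
from contextlib import contextmanager


class InFlightTracker:
    """Records operations that have started but not yet finished."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the example can be tested
        self._lock = threading.Lock()
        self._ops = {}               # token -> (operation name, start time)
        self._next_token = 0

    @contextmanager
    def track(self, name):
        with self._lock:
            token = self._next_token
            self._next_token += 1
            self._ops[token] = (name, self._clock())
        try:
            yield
        finally:
            with self._lock:
                del self._ops[token]

    def stuck(self, older_than):
        """Return (name, age in seconds) for operations running at least `older_than` seconds."""
        now = self._clock()
        with self._lock:
            return sorted(
                (name, now - start)
                for name, start in self._ops.values()
                if now - start >= older_than
            )


tracker = InFlightTracker()
with tracker.track("db.query"):       # hypothetical operation name
    print(tracker.stuck(older_than=0.0))
print(tracker.stuck(older_than=0.0))  # nothing in flight after the block exits
```

A watchdog thread or an admin endpoint could call `stuck()` periodically and log the result, which gives an answer to "which of the thousands of I/O call sites is hanging" without stopping the process.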
b*********r
Posts: 651
18
You should have asked him to state all of his "suppose"s at once...
v*****1
Posts: 2200
19
I don't understand any of this, but he was definitely sabotaging you. Don't harbor any illusions.

[In reply to k****t, post 1]

w********s
Posts: 1570
20
ptrace, strace
In procfs, check status, locks, and context switches.
It looks like you lack hands-on practice. Did you only grind practice problems?
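
Editor's sketch: on Linux, `/proc/<pid>/status` exposes fields such as `State`, `voluntary_ctxt_switches`, and `nonvoluntary_ctxt_switches`. Sampling them twice is one cheap, read-only check of whether a task is still being scheduled. The helper names and the sample text below are made up for illustration; per-thread files live under `/proc/<pid>/task/<tid>/status`.

```python
def parse_proc_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_proc_status(pid):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())


def ctxt_switch_delta(before, after):
    """Context switches that happened between two samples of the same task."""
    keys = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    return {k: int(after[k]) - int(before[k]) for k in keys}


# Hypothetical sample text in the format of /proc/<pid>/status (most fields omitted).
SAMPLE = (
    "Name:\tjava\n"
    "State:\tS (sleeping)\n"
    "Threads:\t48\n"
    "voluntary_ctxt_switches:\t150\n"
    "nonvoluntary_ctxt_switches:\t12\n"
)
first = parse_proc_status(SAMPLE)
second = parse_proc_status(SAMPLE)  # stand-in for a second sample taken later
print(first["State"], ctxt_switch_delta(first, second))
```

If both counters stay flat between samples and the state is sleeping or uninterruptible, the task is not running at all, which points toward a blocked wait and away from a busy loop.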

[In reply to k****t, post 1]

w********s
Posts: 1570
21
He will tell you that you can't casually kill things in prod, and that the log is the first thing you can look at.
If you can kill, why not attach gdb?
What he means is that there is no gdb in prod.

[In reply to g***s, post 5]

w********s
Posts: 1570
22
He said prod, not a QA environment where you can mess around freely.

[In reply to g*****g, post 6]

w********s
Posts: 1570
23
This is simply a technical question, meant to tell people who only grind problems apart from those with experience.
In fact, this question reveals a lot about your level.
If you're really good, you can use procfs and ptrace to put together something gdb-like.

[In reply to p*****y, post 11]

g*****g
Posts: 34805
24
What I described is of course how it's done in prod.

[In reply to w********s, post 22]

g*****g
Posts: 34805
25
Do you really think kill -3 kills the process?

[In reply to w********s, post 21]

d****n
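Posts: 1637
26
I think the OP's answers were already quite professional. The interviewer probably wanted some half-baked tricks.
First look at disk I/O
iostat?
Then look at the database
Then look at the network
netstat?
Determine which one is the problem; if it's a design problem, go back to kill -3, core dump -> gdb.
That said, if production has none of the methods the OP offered, then it's a truly crappy prod that just
leaves people to clean up its mess.
Production should ideally have a system monitoring service (such as appdynamics.com).
If it doesn't, then just "suppose" there's no this and no that, haha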
c***z
Posts: 6348
27
Exactly, the right answer is "it depends".
Also, it is very rude to interrupt people; you were probably being set up.

[In reply to J****n, post 12]

k**0
Posts: 19737
28
program log + email notification. Also set up server notifications in case the
program hangs.
With an interviewer who is all talk, there's no need to argue the problem in technical detail.
The biggest problem with technical staff is being too technical; to move up, you have to learn to read your audience.

[In reply to k****t, post 1]

a****l
Posts: 8211
29
The second paragraph here is the right answer.

[In reply to g*****g, post 6]

j******o
Posts: 4219
30
For this kind of question every system and program has a different answer, so talking about exactly what to do is nonsense. How do you know
your system even has kill -3?
What exactly to do was already decided at the design stage; logs and daemons are the more common approaches.
l*********u
Posts: 19053
31
If you know the code well, you should know which things the app does. Check them in order and you'll quickly find where it hangs.

[In reply to k****t, post 1]

b*******e
Posts: 4483
32
This is exactly what he was asking, and you didn't get it right.

[In reply to k****t, post 3]
i****k
Posts: 668
33
But how do you know it's Java... And what if a predecessor quietly handled it?

[In reply to g*****g, post 25]
f*******s
Posts: 182
34
Dump stack trace. It might be you have an infinite loop in code.