第2页 - 关于sorting的讨论汇总 - 话题女王

全部话题 - 话题: sorting

m******9
发帖数: 968

来自主题: JobHunting版 - 一个特别的inplace merge two sorted arrays

先用scan的方法，比如用2 pointer都从A B的左端开始scan，将smaller numbers都
swap到A
中，然后再sort B，例如用heap sort， quick sort
整体上是O(nlgn)

rest
O(nlogn

M******q
发帖数: 94

来自主题: JobHunting版 - C++ Q41: std::sort (C4)

sort() takes RandomAccessIterator, which is just a template parameter name.
As long as the argument of sort() support what RandomAccessIterator is supp
osed to support, sort() should work. Quotes from "thinking in c++ volume 2
":
RandomAccessIterator. This type of iterator supports all the operations that
a regular pointer does: you can add and subtract integral values to move it
forward and backward by jumps (rather than just one element at a time), you
can subscript it with operator[ ], you

n**z
发帖数: 155

来自主题: JobHunting版 - c/c++ qsort/sort 来 sort array of string

static void sort(Object[] a)
Sorts the specified array of objects into ascending order,
according to the natural ordering of its elements.

c*******n
发帖数: 63

来自主题: JobHunting版 - c/c++ qsort/sort 来 sort array of string

只知道STL Algorithm里有sort，不知道有qsort
ps: sort适合有RandomAccessIterator的container

s*****y
发帖数: 897

来自主题: JobHunting版 - You have 100 files, each containing 10G sorted integers. 求解?

为什么一般用counting sort好呢，我以为一般都用merge sort。
counting sort的话你要知道所有数字的range 和ram的大小，能够把所有的counter
hold在ram里面才行。

m**q
发帖数: 189

来自主题: JobHunting版 - Find the K largest element in a sorted M*N array

Looks like an old question, but failed to find a better solution than O(klg(
m+n)). Any ideas?
The question is quoted from a comment on ihas1337code website
------
1.Two reverse sorted arrays A and B have been given. such that size of A is
m and size of B is n
You need to find the k th largest sum (a+b) where a is taken from A and b is
taken from B. such that k < m*n It can be done in O(n*n) but i believe it
can be done in a very efficient way. I heard someone say there is a paper
from SODA. I... 阅读全帖

f****4
发帖数: 1359

来自主题: JobHunting版 - google老题：Find kth largest of sum of elements in 2 sorted array

找到一个对的O(N)的解法
http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=ri
This is a pretty good puzzle. You can actually find the cutoff and a
description of exactly which pairs are in the solution in less than O(N)
time, but outputting all the solutions takes O(N) time, so it doesn't help
you overall. This won't be much of a hint, but finding the cutoff point can
be done in O(sqrt(N)*log2N) time. Forgive me for not writing code, but this
gets pretty nitty-gritty in spots.
I think of the pro... 阅读全帖

f****4
发帖数: 1359

来自主题: JobHunting版 - google老题：Find kth largest of sum of elements in 2 sorted array

W***u
发帖数: 81

来自主题: JobHunting版 - 问一个 String array sorting 的题。

原题是,Write a method to sort an array of strings so that all the anagrams
are next to each other.
看到的靠谱的方法是先 sort 每一个string，然后给每一个sorted后的string
generate 一个key 然后用hashtable 来找到 anagrams的String. 我的问题是这个hash
table 如何生成，能够保证，解决conflicts，并且保证O(1)的查找呢？
本人菜鸟，对hashtable的理解还在寻找hash function 然后每一个key 对应一个
value的层次。简单的就是在知道key的范围的时候用数组实现，当不知道key值范围的
时候就不知道怎么搞了？看到C++里的hashtable 实现都是按照pair 插
入进去的，真心不知道这样的查找怎么能够保证O(1).
求各位前辈高人解疑释惑。拜谢！

d********t
发帖数: 51

来自主题: JobHunting版 - 请教一个题: Median of Two Sorted Arrays

Median of Two Sorted Arrays
There are two sorted arrays A and B of size m and n respectively. Find the
median of the two sorted arrays. The overall run time complexity should be O
(log (m+n)).
解答如下:
问题1> max(0, (m-n)/2)和min(m-1, (m+n)/2)用处是什么
问题2> 这个解答的总体思路是什么
多谢了
double findMedianSortedArrays2(int A[], int m, int B[], int n) {
return findMedianHelper2(A, m, B, n, max(0, (m-n)/2), min(m-1, (m+n)
/2));
};
double findMedianHelper(const int A[], const int m, const int B[], const
int... 阅读全帖

s*****l
发帖数: 9

来自主题: JobHunting版 - 问个CareerCup上external sort的题

第四版10.4
Imagine you have a 20GB file with one string per line. Explain how you
would sort the file.
答案是external sort,先分若干chunk,把每个chunk放到memory里sort后放回去,然后再
merge.
我的疑问是
1）答案是one chunk by one chunk 的merge,这是不是代表不需要其他空间来merge,就
是速度慢了点，假设memory的size是1GB，是不是代表merge 20次？
2）如果采取20个一起merge的话，是不是系统还要再给个20GB的disk的临时空间,以便
存merge后的结果再放回去?
谢谢啦

z****p
发帖数: 18

来自主题: JobHunting版 - 问题：Find the minimum number of "swaps" needed to sort an array

I guess this problem is not about algorithm. It asks for the number of min
swaps. It's more like a math problem. As long as we give N-K, our job is
done, and we don't have to implement any algorithm.
Think this problem in another way:
-There is a large graph, where each node is a permutation of the given list
of data; there is an edge points from node i to node j if we can move from
permutation-i to permutation-j by using one "swap".
-Now the question is I pick two nodes i and k in the graph, wh... 阅读全帖

b*******e
发帖数: 217

来自主题: JobHunting版 - 再论 mini # of swaps to sort array.

incorrect.
see example 3214
first round (1) 32 needs swap
get 1432
second round (2) 4 need to swap
get 1324
not sorted.
I guess needs to use some advanced data structure like stack....
keep tracking the local min.... then pop us the elements larger than the
local min and insert them to the end one by one.
(1) push 32 into stack.
(2) when hit 1, see 1 is local min
(3) pop up 2,3 and insert them to the back of the array
(4) now array is 1423.
(5) the array pointer now go to 1 (note we can track th... 阅读全帖

l*******m
发帖数: 1096

来自主题: JobHunting版 - heap sort的缺点是什么？和quick sort比

quick sort 快，因为cpu cache friendly。搞搞数值算法就知道现在CPU的运算单元一
般都不会跑满，CPU io花好多时间，由于heap操作，index是跳来跳去的，cache
reloading太多。而quick sort， index 是滑动的，所以overhead低

f****p
发帖数: 18483

来自主题: JobHunting版 - heap sort的缺点是什么？和quick sort比

我一直在说是平均情况，你自己去看。那个O(n^2)是在倒序的情况。你自己随便用个数
据你就比较一下就知道了，在实际情况下，都差不多是随机的，quick sort快的不是一
点半点。这个和architecture什么基本没毛关系。很多时候，重要的是不是in-place的
排序。特别是大的数据库或者storage的时候，所以有时那个只用merge sort。

stable

s***5
发帖数: 2136

来自主题: JobHunting版 - heap sort的缺点是什么？和quick sort比

那用randomize数组啊？每次randomly选一个pivot index就可以保证不会到O(n^2)了。
只要不是在distributed cluster上，quick sort应该最快的。在cluster上用merge
sort最方便了。

stable

d******l
发帖数: 98

来自主题: JobHunting版 - heap sort的缺点是什么？和quick sort比

In my opinion, quicksort() is better because the CPU cache thing, has better
locality of reference -- the next thing to be accessed is usually close in
memory to the thing you just looked at. In java, java.util.Arrays.sort()
uses quicksort(), while java.util.Collections.sort() use mergesort().
As for heapsort(), it doesn't build a real heap(tree-heap), it is array/
arraylist in fact, through the index jumping, give us an illusion of a heap.
By the way, bless for myself recent onsite. ^_^

c********p
发帖数: 1969

来自主题: JobHunting版 - 请教一下external sorting的问题

怎么会是4次啊？
比如array 1 和array2 比较了之后 array 1里放的都是比array2小的，
然后array 3再和array1比较，array1里边放的是比array3小的，但未必array 3放的是
比array 2小的。。。
这样下去，，要比较多少次？
另外，又想起来一个问题，怎样in place的sort两个sorted array啊，就算是merge
sort也不是in place的。

y*****3
发帖数: 451

来自主题: JobHunting版 - Convert Sorted List to BST那道题

Convert Sorted List to BST那道题的O(N)解法看了一整天了还没看懂，用java写的话
，为什么head = head.next 那个赋值不行啊？？为什么必须要手动修改head的值啊？
还有，如果不考虑时间复杂度的话，就单纯从技术上讲，这种bottom to top的方法
build bst应该完全可以应用在sorted array那道题上吧？可为什么我sorted array那
道题这样build tree就不成呢？？看来还是没搞明白head=head.next那句话到底是怎么
回事。请哪位大牛给讲解一下吧！实在是想破了头了，多谢了！！

r****c
发帖数: 2585

来自主题: JobHunting版 - sort list java solution

merge sort, recursion is pretty straightforward, sort first half and sort
the second half.
notice it is linked list so easy to insert. You only change the next pointer
use O(1) space

p**o
发帖数: 3409

来自主题: JobHunting版 - 请教iterative merge sort list的代码

def mergesort_topdown (aList):
""" Top-down merge sort.
A total of logn splits;
in each split we merge n items.
So the complexity is O(n logn).
"""
_splitmerge (aList, 0, len(aList))
def _splitmerge (aList, i, j):
""" Recursively split runs into two halves
until run size == 1 (considered sorted), then
merge two halves and return back up the call chain.
"""
if j - i > 1:
m = (i + j) // 2
_splitmerge (aList, i, m)
... 阅读全帖

b**********5
发帖数: 7881

来自主题: JobHunting版 - 这题应该用bucket sort还是counting sort

counting sort不就是bucket sort的一种specialised？ bucket size为一。。。
这种东西，我面试时，会问interviewer，他要哪种。。。。

people

p******r
发帖数: 122

来自主题: JobHunting版 - 如何让python dictionary sorting 的速度变得很快？ (转载)

【以下文字转载自 Programming 讨论区】
发信人: prettier (I know), 信区: Programming
标题: 如何让python dictionary sorting 的速度变得很快？
发信站: BBS 未名空间站 (Fri Aug 26 23:19:43 2016, 美东)
如何让python dictionary sorting 的速度变得很快？
要sort 一些 python dictionary , 找出 value 最大的key, 一个 dictionary 的长度
大概有 300,000 。然后发现速度奇慢。真的不是一点的慢。
代码用的这个：
max(stats.iteritems(), key=operator.itemgetter(1))[0]
问一下怎样能 speed up?

j*****k
发帖数: 1198

来自主题: JobHunting版 - how to sort strings if alpha order is changed

I did not notice the detail how each language implement string sort, just
use it. Now, if the alpha order is changed, a language's builtin sort
function does not work. How do we implement the sort function based on the
changed alpha order?

p******r
发帖数: 122

来自主题: SanFrancisco版 - 如何让python dictionary sorting 的速度变得很快？ (转载)

c******n
发帖数: 4965

来自主题: THU版 - parsing bibliography and sorting

my wife was spending a lot of time sorting the bibliography of her thesis,
because her bib was obtained in plain text form, I have to parse out the
first author last name first. so I wrote this little piece of code. hope it
will be useful for someone too....
right now it fails to parse single-author bib, cuz it's difficult to
recognize a human name from other words. but for biology papers, a paper
mostly has multiple authors
sub get_first_author($) {
my ($line) = @_;
my ($author, $secon... 阅读全帖

m*******a
发帖数: 192

来自主题: CS版 - sorting问题求教。 (转载)

【以下文字转载自 SanFrancisco 讨论区】
发信人: maladonaa (hello), 信区: SanFrancisco
标题: sorting问题求教。
发信站: BBS 未名空间站 (Thu Oct 14 20:27:51 2010, 美东)
If I have N numbers
and I want to find out the top X of them
and sort the top X from largest to smallest,
what kind of sorting algorithm would you recommend? thanks

l********o
发帖数: 89

来自主题: CS版 - sorting问题求教。 (转载)

不需要用什么quick sort。就最简单的selection sort就行了。
扫第一遍找出N里最大的，再扫一遍找剩下N-1里面最大的。扫X遍就完了。
Quick sort之类的是如果你要把整个序列排序的算法。你找Top X个只需要扫前X遍算
法就可以
停了。这个在N>>X的情况下要比什么quicksort快得多。

f*******w
发帖数: 1243

来自主题: CS版 - 问个sorting相关的题 (转载)

【以下文字转载自 JobHunting 讨论区】
发信人: fenghaolw (生如夏花), 信区: JobHunting
标题: 问个sorting相关的题
发信站: BBS 未名空间站 (Thu Sep 20 19:58:35 2012, 美东)
已知7个数，关系如下
x1 > x2 > x4
x2 > x5
x1 > x3 > x6
x3 > x7
问怎么排序是最优的，最优的方法worst case需要多少次比较
我是用merge sort的想法，先merge
x2, x4 和 x2, x5
需要1次比较 (x4 和 x5)
然后merge
x3, x6 和 x3, x7
需要1次比较 (x6 和 x7)
然后merge
x2, x4, x5 和 x3, x6, x7
需要5次比较（merge sort：3+3-1次比较）
所以总共是7次比较
但是不知道怎么证明是最优的？或者这样不是最优的？

s*k
发帖数: 144

来自主题: Database版 - SORTING 中文 IN ORACLE

In fact binary sort is a sort according to the binary
value of the double bytes character. If you are dissatisfied
with the result, I think you may have to sort by external
programs or pl/sql package of you own.
By far, I don't know where you can get the package.

s*****w
发帖数: 215

来自主题: DotNet版 - 问一个Array Sort的问题

因为平常处理大量数据的时候都是import到database里面的。所以没有想过这个问题。
最近遇到一个问题，是关于Array Sort的。
Array有Array.Sort(）方法，看了msdn说他用了Quick Sort Algorithem. The average
the method is O(nlogn) operation, the worst case it is an O(n^2) operation.
这是不是意味着，我们处理这些数据的时候没有必要自己写binary tree的data struct
ure，（C#里面没有BinaryTree的data type)，因为BTree是O(nlogn) operation. 如果
自己写BTree，好处在哪里？

b******y
发帖数: 9224

来自主题: Java版 - Sort TreeMap by value?

lz好像是说，如何sort value, 而不是sort key. 你说的是key是sorted.

c*******g
发帖数: 1996

来自主题: Linux版 - 请教sort

sort -t~ -k3,3n -o test.sorted test.sort
这个排序按第二个~后面的字符串用数字排序怎么排的不对呢,似乎用的第一个~前面的
index
1~7~7#0/1
1~7~7#0/1
1~7~7#0/1
1~7~7#0/1
1~8201150~7#0/2
2~7~7#0/1
2~8201150~7#0/2
2~8201150~7#0/2
2~8201150~7#0/2

S*A
发帖数: 7142

来自主题: Linux版 - 为什么sort 的速度在单个server比cluster上快很多

cluster 上面的 sort 是如何实现的？如果你用 MPI 那种
经常传东西的当然慢了。
cluster 之间传数据和本地访问内存的延时比较大。
对于数据有很多相关依赖性的 cluster 不适合。
cluster 适合数据可以被切割成小块，在小块里面可以独立运算的。
这个 sort 比较适合用 map reduce.
把 80G 分成很多小 N 份，给 N 个节点分别 sort.
然后 merge 就是 O(n) 的。
网上一发一收 80G 就是 160G 数据，这个就要花很多时间。
所以单机没准还快点。

l**********g
发帖数: 503

来自主题: Linux版 - 请教sort

我想按照：p后面的值来sort数据，我现在用下面指令，
sort -t'>' -k 2，2n
只能做字符顺序sort，如何能让第2行出现在第3行呢？谢谢！
{:rc=>0, :p=>4, :l=>1, :st=>"FLORIDA"}
{:rc=>0, :p=>39, :l=>1, :st=>"MICHIGAN"}
{:rc=>0, :p=>5, :l=>1, :st=>"MISSOURI"}

a****l
发帖数: 8211

来自主题: Programming版 - 几道面试题:memory, sort, 等

I think 1 is totally BS. The theoratical minimum of sorting is O(nlogn), so
there is no O(n) sorting algorithm at all, unless you are not trully sorting
a "random" array,i.e., there is something unique about the array. And, O()
notation only matters when you have large data set.

c*****t
发帖数: 1879

来自主题: Programming版 - 几道面试题:memory, sort, 等

Go read radix sort on wikipedia. O(nlogn) is the theoretical minimum
when only considering comparison operators. Radix sort works on integers
because of use of additional properties of integers.

so
sorting
()

c********x
发帖数: 84

来自主题: Programming版 - 几道面试题:memory, sort, 等

1.给一个array with random numbers, how to efficiently sort it with O(n)? 说
是这个array size 较小.请问怎么sort?
buble sort, o(n) - o(n2)
2.给定一个64kB buffer及其头指针, 要在里面实现类似malloc(size), free(ptr)之类
的function., 而且只有连续未用空间>size, 才能 malloc(size).他说用linklist实现
较好.还说malloc完要加一个结束符,这样 free时能知道赋值时有无溢出.他还问我如何
实现这个linklist使得不用每次malloc 前都去从头scan有无空位. 他说的挺快的,我没
跟上.有人会作这题吗?或从那儿可以找到答案?
you need to build a table to maintain all the pointer and the size
of block.
3. 如何判断是bigendian, little endian,
int i = 1;
if(*(char*)&i) ret

s*v
发帖数: 3

来自主题: Programming版 - Sort cygwin question

When using cygwin "sort" on windows command line, I can't get it
to work with TAB deliminited docs.
sort -t '\t'
sort -t "\t"
won't work.
Any idea?
Thanks.

d**s
发帖数: 920

来自主题: Programming版 - 如何 randomize 一个sorted的文件 ?

如何 randomize 一个sorted的文件 ?
有没有现成的 Unix command or utility ?
For example, I have a sorted file, it has entry:
abc
abd
zzy
zzz
(It is a result of unix sorting output).
Is there a unix command or utility to randomize lines in the file ?
Thanks.

z****e
发帖数: 2024

来自主题: Programming版 - 哪位大侠给说说何时用 merge sort, 何时用 quick sort，何时

list, forward backward access: merge sort
vector, random access: quick sort
right?

o*z
发帖数: 1078

来自主题: Programming版 - 如何sort and merge n 个sorted linked list

如何sort and merge n 个sorted linked list？
假设n 很大，比如million, 用什么结构和方法优化？

h******3
发帖数: 3

来自主题: Programming版 - 如何sort and merge n 个sorted linked list

二楼的办法 time = O(M*log(n)) 假设n个sorted list，一共M个element。
因为 merge 两个sorted list 是 linear time。而每个element会被merge log（n）
次。
三楼的办法 best case 是 O(M*log(n)) 。 average case可能要差很多。因为分区不
均匀。
内存都是 O(M）

X****r
发帖数: 3557

来自主题: Unix版 - sort problem

sort -n -b -k3.5 | sort -s -b -k3.1,3.4
( GNU sort 2.0.21 )

p*****t
发帖数: 35

来自主题: Unix版 - How to do that using "sort" command?

the behaviour you want is the default behaviour of sort
so just sort the file
jumpstart8# cat kk
1 2 3
1 3 3
1 4 3
1 2 4
jumpstart8# sort kk
1 2 3
1 2 4
1 3 3
1 4 3
see what I mean ?

h*****G
发帖数: 113

来自主题: Biology版 - cell sorting 问题求助

hello,
最近需要用cell surface marker 细胞染色，sort positive 的细胞，然后重新plate
培养。每次sort完细胞，重新plate以后，细胞总是大量死亡。希望大家有什么建议。
下面是我的protocol
Trypsin cells. Stop reaction. collect cells. Wash once with PBS+0.2% BSA.
Staining cells in the dark at RT for 25 min. Staining buffer: PBS (w/o Ca2+
Mg2+) + 0.2% BSA+ 0.09% NaN3.
Wash three twice. (500gX5 min every time)
Resuspend cells in Staining buffer. Sort cells..
Replate positive cells.

G***G
发帖数: 16778

来自主题: Biology版 - sort vectors

given a set of numbers, such as ,1,5,4,2, we can sort it and the result is 5
,4,2,1 by the descending order.
but if given a set of vectors, how to sort them,
for example
a1={1,1,1,1,4}
a2={2,1,4,3,2}
a3={3,2,4,2,1}
how to sort a1, a2, and a3?

l**********1
发帖数: 5204

来自主题: Biology版 - sort vectors

How to sort a generic List
After reading this post from Steven Smith I thought I should write something about it.
Sorting a generic List is pretty straightforward if you know how to do it. With C# 2.0, anonymous
methods come at hand, as well as the little known Comparison delegate (check out this post for more
information about this class as well as other useful classes new to C# 2.0).
Ok, let's suppose we have a product class (let me save some space by using C# 3.0 syntax).
-----
Publi... 阅读全帖

s*****3
发帖数: 41

来自主题: Biology版 - single cell sorting

准备做single cell sort。把细胞sort到384-well plate里面。如果想要的细胞比例
在30%左右，如果sort 20 到 50 plates的话，需要多长时间呢？有这方面的经验的
讨论一下。谢谢

s******t
发帖数: 150

来自主题: Computation版 - c++, sort() 为啥显示结果不对

ifstream input("story.txt");
vector words;
if(!input)
cerr << "Could not open story.txt!\n";
else
{
string line;
while(getline(input,line))
words.push_back(line);
}
//close file
input.close();
cout << "Before sort(): \n";
for(vector::iterator itr = words.begin();
itr != words.end(); ++itr)
cout << *itr << "\t";
cout << endl;
//cope sortWords from words
vector sortWor... 阅读全帖

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

topics

未名新帖统计// 7月16日

历史上的今天